Real DevOps and Cloud Interview Questions: Prepare for DevOps, SRE, Cloud & Data Engineering Related Roles #3

Below is a curated list of real candidate experiences, shared directly via LinkedIn. A big thank you to everyone who contributed their real DevOps interview experiences and questions and provided valuable insights. LinkedIn post links are included for reference. This page is intended to support the community, especially those preparing for DevOps, SRE, Cloud, and Data Engineering interviews or considering a job change.

Real DevOps interview experience and questions

Experience #1: Posted on July 7, 2025 (EY)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7347855268489154561

EY Data Engineer Interview Questions:

1. Explain your project architecture.
2. How much data do you handle on a day-to-day basis, and what is the business case for your project?
3. Which cloud did you use in your project? This was followed by questions on AWS services, such as how you transfer data from a local path to AWS S3 when extracting data from the source.
4. What is the cluster node for your EMR?
5. Write a SQL query that joins the dept, emp, and sales tables to return the sum and average salary.
6. Write Spark code to find the department-wise 10th highest salary and the top 6 salaries using a PySpark DataFrame.
7. What is SCD, what is a surrogate key, and why is it required?
8. What types of joins are available in Spark, and why is a broadcast join required?
9. What optimization techniques have you used in your Spark code?
10. Some questions on pandas, and what is the major difference between pandas and Spark?
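
As a practice aid for questions 5 and 6, here is a minimal sketch of the two query patterns (join plus aggregate, and N-th highest value per group) using Python's built-in sqlite3 module. The table names, columns, and sample rows are hypothetical, only dept and emp are modelled (the sales table from the question is left out), and the interview asked for PySpark, so treat this as the same ranking logic expressed in SQL rather than the expected answer. It assumes the bundled SQLite supports window functions (version 3.25 or later).

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical dept and emp tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
        CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, dept_id INTEGER, salary REAL);
        INSERT INTO dept VALUES (1, 'Sales'), (2, 'Engineering');
        INSERT INTO emp VALUES
            (1, 1, 100), (2, 1, 200), (3, 1, 200), (4, 1, 300),
            (5, 2, 400), (6, 2, 500);
    """)
    return conn


def salary_stats_by_dept(conn):
    """Join dept and emp, then return (dept_name, total, average) per department."""
    return conn.execute("""
        SELECT d.dept_name, SUM(e.salary), AVG(e.salary)
        FROM dept d
        JOIN emp e ON e.dept_id = d.dept_id
        GROUP BY d.dept_name
        ORDER BY d.dept_name
    """).fetchall()


def nth_highest_salary_by_dept(conn, n):
    """Return (dept_name, salary) for the n-th highest distinct salary in each department.

    DENSE_RANK gives tied salaries the same rank, so duplicates do not
    consume a position. Departments with fewer than n distinct salaries
    are simply absent from the result.
    """
    return conn.execute("""
        SELECT DISTINCT dept_name, salary
        FROM (
            SELECT d.dept_name, e.salary,
                   DENSE_RANK() OVER (
                       PARTITION BY d.dept_name ORDER BY e.salary DESC
                   ) AS rnk
            FROM dept d
            JOIN emp e ON e.dept_id = d.dept_id
        )
        WHERE rnk = ?
        ORDER BY dept_name
    """, (n,)).fetchall()
```

Changing the filter to `rnk <= 6` would return the top 6 salaries per department instead of a single rank. Whether ties should share a rank (DENSE_RANK) or not (ROW_NUMBER) is worth clarifying with the interviewer.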

Experience #2: Posted on July 7, 2025 (HiLabs)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7347864219578425345

Technical Interview Experience – DevOps Engineer at HiLabs (Round 1)

Here’s a fresh and practical set of questions from the interview:
🚀 CI/CD Workflows & GitHub Actions / Azure DevOps
✅ How do you manage pipeline failure handling and notify teams via Slack or Teams?
✅ What’s the best approach to deploying multiple services using a single pipeline without hardcoding configs?
✅ How do you handle rollback automatically using GitHub Actions or Azure DevOps?
✅ What strategies do you use for minimizing pipeline execution time during PR validations?

🚀 Kubernetes (AKS) & Container Management
✅ A pod crashes randomly under high load — what’s your process to debug and fix it?
✅ How do you securely inject secrets into containers running in AKS?
✅ What’s the difference between kubectl rollout restart and kubectl delete pod?
✅ How would you configure a Kubernetes cluster for zero-downtime deployments?

🚀 Azure Infrastructure & Cloud Automation
✅ How do you configure VNet integration for Azure App Services?
✅ Explain how Azure Key Vault integrates with Terraform and Azure DevOps pipelines.
✅ What’s the difference between managed identity and service principal in Azure — which one do you prefer and why?
✅ How do you structure Terraform for multi-team, multi-subscription environments?

🚀 Monitoring, Logging & Troubleshooting
✅ How would you investigate and resolve a memory leak in a containerized API?
✅ What’s the role of Application Insights vs Azure Monitor — when do you use which?
✅ How do you set up alerts when a Kubernetes pod restarts more than 3 times within 10 minutes?
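
For the last question above, the core idea is a sliding time window over restart events. In a real cluster this would normally be an alerting rule over restart counters in the monitoring stack. The sketch below only illustrates the windowing logic in plain Python. The class name, method name, and defaults are hypothetical, and it assumes restart timestamps arrive in non-decreasing order.

```python
from collections import deque


class RestartAlert:
    """Fire when more than `threshold` restarts fall within `window_seconds`."""

    def __init__(self, threshold=3, window_seconds=600):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._restarts = deque()  # timestamps of restarts still inside the window

    def record_restart(self, timestamp):
        """Record a restart at `timestamp` (in seconds); return True if the alert fires."""
        self._restarts.append(timestamp)
        # Evict restarts that are older than the window relative to this one.
        while self._restarts and timestamp - self._restarts[0] > self.window_seconds:
            self._restarts.popleft()
        return len(self._restarts) > self.threshold
```

With the defaults, the fourth restart inside any 10-minute span is the first one to trigger the alert, which matches "more than 3 times within 10 minutes".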

🚀 Git & Collaboration
✅ How do you enforce PR checks and approval policies in a fast-moving DevOps team?
✅ What’s your branching strategy for multi-environment deployments using GitOps?
✅ How do you safely do a force push when collaborating in a shared repo?

Experience #3: Posted on July 9, 2025 (LogicLoop)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7348779577214652418

My AWS DevOps Engineer Interview Experience at LogicLoop – Gurugram 🇮🇳

🧠 Interview Questions Asked:
1️⃣ What is a Target Group?
2️⃣ Types of Load Balancer?
3️⃣ How to reboot a system?
4️⃣ How to schedule cron jobs?
5️⃣ What is df, du in Linux?
6️⃣ How to check logs in Linux?
7️⃣ What is a DaemonSet?
8️⃣ What is AutoScaling in Kubernetes (K8s)?
9️⃣ What is the Control Plane?
🔟 What is a Master Node?
1️⃣1️⃣ Which monitoring tool are you using in your project?
1️⃣2️⃣ What is AWS Lambda?
1️⃣3️⃣ What is the maximum number of buckets we can create in S3?

Experience #4: Posted on July 7, 2025

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7347934970650206208

Crack Your Next Data Engineering Interview with These Project-Level Questions!

General Project-Based Questions

1. Can you walk me through your recent data engineering project?
2. What was the architecture of your project?
3. Which cloud platform did you use and why?
4. What was your role in the team?
5. What tools and technologies did you use in this project?
6. How did you handle large-scale data ingestion and processing?
7. What were some major challenges you faced and how did you overcome them?
8. How did you ensure data quality and integrity?
9. Explain any end-to-end ETL pipeline you built.
10. How did you automate the pipeline? Did you use any orchestration tool like Airflow or ADF?

Spark / PySpark / Scala Related Project Questions

11. How did you use Spark in your project?
12. What kind of transformations and actions did you use in PySpark/Scala?
13. Did you use RDD or DataFrame APIs? Why?
14. How did you handle performance tuning in Spark?
15. Explain a scenario where you used caching or persistence.
16. Have you implemented any join optimizations like broadcast joins or bucketing?
17. How did you manage memory and executor configurations in your job?
18. Did you face any skew issues? How did you resolve them?

Hive / SQL Related Project Questions

19. How did you use Hive in your project?
20. Did you use partitioning or bucketing in Hive? Explain.
21. What types of queries did you run on Hive?
22. How did you optimize slow Hive queries?
23. Explain a use-case where you used window functions or ranking in Hive.

AWS / Cloud Based Project Questions

24. How did you use AWS services like S3, EMR, Athena, Glue, or Redshift in your project?
25. How did you configure and run your Spark jobs on EMR?
26. Did you use IAM roles or any security best practices in AWS?
27. Explain how you transferred data from S3 to Redshift (or vice versa).
28. Have you used Athena to query S3 data? How?
29. What monitoring tools did you use in AWS (CloudWatch, logs, etc.)?
30. How did you trigger jobs using Lambda or ADF?

Real-Time & Batch Pipeline Questions

31. Did you work on real-time streaming? Which tools did you use?
32. Can you explain the difference between real-time and batch in your project?
33. How did you ensure exactly-once processing in Spark Structured Streaming?
34. How was the data stored — in what format and where?
35. How often was the batch job scheduled and monitored?

Experience #5: Posted on July 11, 2025

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7349482225152770049

#Jenkins Interview Questions:

1. How do you pass parameters between stages in a Jenkins declarative pipeline?

2. What is an agent in Jenkins? How do you configure a pipeline to run on a specific agent?

3. Explain how you use shared libraries in Jenkins.

4. Have you handled parallel execution in a pipeline? How and why?

5. How do you implement approval gates in a Jenkins pipeline (e.g., manual approval before production)?

6. What is the difference between rebase and merge? Which one do you prefer in CI/CD workflows?

7. How do you manage version control for infrastructure (Terraform/Ansible) in Git?

8. How do you trigger a pipeline based on a Git tag push instead of a branch commit?

9. Have you used Git hooks or automation to enforce commit message standards?

10. How do you manage sensitive variables and secrets in Terraform?

11. What happens if someone manually changes infra outside of Terraform? How do you detect and fix it?

12. What is the difference between terraform taint and terraform import?

13. How do you organize Terraform code for a multi-environment setup (dev/stage/prod)?

14. What is the difference between an Azure Resource Group and AWS VPC?

15. How do you automate the provisioning of a virtual machine using Terraform on Azure (or EC2 on AWS)?

16. What is an Azure DevOps YAML pipeline? How do you structure it for multi-stage deployment?

17. Explain the role of Azure Service Principal and how it is used in DevOps pipelines.

18. What cloud-native monitoring/logging solutions have you worked with (e.g., Azure Monitor, AWS CloudWatch)?

19. You have a containerized app running fine locally but failing on Jenkins – what steps do you take to debug it?

20. How do you use Kubernetes probes (liveness/readiness)? Why are they important?

21. How do you do Helm-based deployments in Kubernetes?

22. What’s the difference between StatefulSet and Deployment in Kubernetes?

23. How do you store and access persistent data inside a Kubernetes pod?

24. How do you audit and rotate credentials stored in DevOps tools?

25. What is your approach to shift-left testing in a DevOps pipeline?

26. What tools have you used for vulnerability scanning (e.g., Trivy, Aqua, etc.)?

27. How do you enable RBAC in Kubernetes or IAM in Azure/AWS to limit access to resources?

28. What is your method for post-deployment monitoring and alerts in a production environment?

Experience #6: Posted on July 7, 2025 (Infosys)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7347914696454275074

Infosys DevOps Interview Experience.

● Tell me about yourself (Self-Intro)
● What source code management tool are you using?
● What CI/CD tools do you use?
● How do you troubleshoot issues in CI/CD pipelines?
● What kind of pipelines have you worked on?
● What is a Jenkinsfile?
● What are your daily DevOps activities?
● Explain Git merge and branching strategies.
● Describe a full CI/CD flow in your project.
● How do you build a Docker image? Explain the steps.
● How do you check CPU usage of a server?
● What operating systems have you worked with?
● What is Ansible? Why did you use it in your project?
● What is Terraform used for?
● What’s the difference between Ansible and Terraform?
● Have you worked with Kubernetes? Explain your use case.
● What AWS services are you familiar with?
● Have you used monitoring tools? How and why are they used?
● What do you do if there’s a Git issue while merging or pushing?
● What kind of Jenkins plugins have you used?
● How do you trigger an Ansible playbook? What’s the command?
● How to run a specific task in Ansible?
● How do you manage Jira tickets or handle tasks?

Experience #7: Posted on July 10, 2025 (PwC)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7348924682911256577

PwC Interview experience
Position: DevOps Engineer

Round 1: Screening Round (30 minutes)
1. Can you walk me through the architecture of your current project? What parts were you directly responsible for?
2. Over the last couple of years, which DevOps tools have you actually used hands-on?
3. Have you worked with AWS in production environments? Which services did you use, and for what purpose?
4. Let’s say you’ve deployed an app in Kubernetes — how would you make it accessible to users on the internet?
5. In your own words, what’s the role of a NAT Gateway in a cloud setup? Where have you used it?
6. Imagine a server is behaving oddly — how would you check what processes are currently running on it?
7. If your disk is getting full, how would you search for files larger than 100MB?
8. When would you use a Deployment vs a StatefulSet in Kubernetes? Have you worked with both?
9. What’s a ConfigMap in Kubernetes? And how is it different from a Secret?
10. How do you usually check if two servers can talk to each other over the network?
11. Tell me about a CI/CD pipeline you’ve worked on — what did it do, and how did you set it up?

Experience #8: Posted on July 10, 2025 (Deloitte)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7348886884317622273

Deloitte DevOps Interview Questions

1. Describe a situation where you had to troubleshoot a technical issue under a tight deadline. What methodology did you follow?

2. Scenario: “Your CI build fails with ‘dependency not found.’ Outline your investigation steps.”

3. You notice intermittent 502 errors during canary deployment. How will you identify the root cause?

4. CI/CD pipeline takes 40+ minutes. What optimizations would you apply?

System Design & Architecture

5. Design a highly available logging/monitoring system for 100+ microservices across 3 regions.

6. How would you implement secure secret rotation (e.g., in Azure DevOps pipelines)?

7. Your production AKS cluster is failing health checks randomly — how do you debug?

8. Explain a rollback plan for a Kubernetes deployment using Terraform.

9. How do you handle zero-downtime schema migrations for a stateful database?

10. Design a cost-efficient nightly-reporting pipeline with 3-year log retention.

Incident Response & On-the-Spot Thinking

11. An Azure function is being throttled. How do you detect and mitigate it?

12. You receive intermittent DNS resolution failures in cloud infra—what could be wrong?

13. A critical end user reports periodic 10-second latency spikes. How do you find the root cause?

14. One pod shows high CPU usage without logs—what’s your next step?

15. Artifact uploads from Jenkins randomly fail—which layers do you investigate?

Managerial & Behavioral

16. Tell me about a time you resolved a production incident with stakeholder pressure.

17. How do you prioritize when critical alerts pop up during an ongoing release?

18. Describe conflict resolution when working with distributed Dev and Ops teams.

19. Have you ever decided not to automate something? Explain your trade-offs.

20. What drives you to continue learning in cloud-native technologies?

Experience #9: Posted on July 8, 2025 (EPAM)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7348383237037006850

EPAM DevOps Interview Questions Part-1

1. How would you design a scalable, highly available CI/CD system for microservices across multiple teams?

2. How would you manage cross-region deployments using Terraform in a multi-cloud setup?

3. How do you implement GitOps in a Kubernetes environment?

4. Can you explain how you would create a fully automated blue-green deployment in a Kubernetes-based microservices architecture?

5. How do you design an end-to-end DevSecOps pipeline for a fintech application with strict compliance requirements (e.g., PCI-DSS)?

6. What are some best practices for managing pipeline as code in large, distributed teams?

7. How would you dynamically provision ephemeral environments (dev/test) using pipelines?

8. In a monorepo setup, how do you ensure that only relevant services are built and deployed in a CI/CD pipeline?

9. How do you implement a canary deployment strategy with real-time monitoring and rollback in a CI/CD system?

10. How do you manage secrets and config securely at scale in Kubernetes without compromising GitOps workflows?

11. Explain the control plane components of Kubernetes and how you would harden them for production use.

12. How would you scale a Kubernetes cluster horizontally across multiple regions and still ensure zero-downtime upgrades?

13. What is a PodDisruptionBudget and how do you use it in critical workloads?

14. How do you implement and manage network policies in Kubernetes for strict inter-service communication?

15. How would you refactor a legacy Terraform codebase used by multiple teams to follow best practices like DRY and modularity?

16. Explain the internals of how Terraform handles dependencies and graph building during the planning phase.

17. How do you manage and isolate Terraform state files across multiple environments and teams?

18. What’s your strategy to prevent and recover from a corrupted or deleted remote backend state file?

19. Have you implemented policy-as-code (e.g., Sentinel, OPA) with Terraform? Give a real use case.

20. How would you implement a centralized logging solution across multiple cloud platforms and environments?

21. What’s your approach to securing cloud-native DevOps infrastructure with Identity Federation (e.g., Azure AD + AWS IAM)?

22. How do you set up workload identity federation between GitHub Actions and Google Cloud / Azure securely?

23. How do you ensure cost-efficient auto-scaling of infrastructure in cloud when managing high workloads in CI/CD?

24. Explain a scenario where you had to design a disaster recovery (DR) strategy for DevOps infrastructure.

25. How do you enforce compliance and auditability in your CI/CD processes across global regions (e.g., GDPR, HIPAA)?

26. What’s your strategy for managing container image security across all stages of a DevOps pipeline?

27. How would you integrate runtime threat detection in Kubernetes using tools like Falco or Sysdig?

Experience #10: Posted on July 11, 2025 (Deloitte)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7349342933684244482

Deloitte Interview Experience

Round 1: Technical Screening
1. Explain the CI/CD workflow you follow and the kind of pipeline you use. How do you define and invoke pipelines in Jenkins?
2. What are shared libraries in Jenkins, and how are they written and defined?
3. What kind of applications do you deploy using Jenkins pipelines, and what deployment tools do you use?
4. If the Jenkins pipeline runs but the build doesn’t happen, what possible issues could be causing it?
5. What is the purpose of a webhook, and how is it used in a CI/CD pipeline?
6. How do you create and manage Kubernetes clusters (using tools like Terraform), and what are the master and worker nodes?
7. What are common Kubernetes errors you’ve faced (like CrashLoopBackOff, ImagePullBackOff), and how did you resolve them?
8. What is the command to access a pod and how can you define or create a Kubernetes class or object?
9. Explain the folder structure of a basic Helm chart. What commands do you use to deploy with Helm?
10. What are the stages in a Docker image build? Why do we use ENTRYPOINT and CMD instructions?
11. How do you manage and connect services like DBs, EC2, EKS, or ECS? Include the command to connect to ECS.
12. Which container registry do you use for storing Docker images?

Round 2: In-depth Technical Screening
1. What branching strategy do you follow, and how do you handle merges to avoid breaking the release branch? If a bug appears in production, what’s your approach to resolving it?
2. Describe your typical deployment flow and CI/CD workflow. What stages do you define in your Jenkins pipeline, and how do you ensure full quality checks during deployment?
3. How do you use Jenkins shared libraries? Explain their typical structure and how they are integrated into your Jenkinsfiles.
4. Are you aware of security scanning tools? How do you scan Docker images—both during build and at the registry level? Are you using any extensions or tools for image scanning?
5. How do you pass environment variables during Docker build commands? What services do you use for storing Docker images?
6. How do you establish a connection with databases in your deployments or infrastructure setup?
7. How do you handle authentication for EKS clusters and store secrets securely in your environment?
8. How do you create AWS Lambda functions and manage the artifacts for deployment? What options do you use to push artifacts to Lambda?
9. What is email signing and Helm chart signing? Which tools do you use to sign Helm charts?

Round 3: HM Round (Hiring Manager)
1. Project experiences.
2. Day-to-day responsibilities.
3. Light behavioral questions.
4. Teamwork & Culture Fit questions.

Experience #11: Posted on July 10, 2025

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7348953422563307520

🌱 Basics – Foundational Concepts
1️⃣ What is DevOps?
2️⃣ What are the main benefits of adopting DevOps in an organization?
3️⃣ What is CI/CD?
4️⃣ What is Infrastructure as Code (IaC)?
5️⃣ What is version control, and why is it important?

🛠️ Tools & Automation
6️⃣ What is Jenkins, and how does it help in CI/CD pipelines?
7️⃣ How does Docker work, and why is it popular for containerization?
8️⃣ What is Kubernetes, and what problems does it solve?
9️⃣ What is Terraform, and how does it differ from Ansible?
🔟 What is Ansible used for?

🧩 Intermediate – Real-World Scenarios
1️⃣1️⃣ What is a Blue-Green Deployment strategy?
1️⃣2️⃣ What is a Canary Deployment, and when would you use it?
1️⃣3️⃣ How do you secure sensitive credentials in a pipeline?
1️⃣4️⃣ What is immutable infrastructure?
1️⃣5️⃣ How do you roll back a failed deployment in Kubernetes?

🧭 Advanced – Monitoring & Observability
1️⃣6️⃣ What is observability, and how does it differ from monitoring?
1️⃣7️⃣ What tools would you use to monitor a Kubernetes cluster? (e.g., Prometheus, Grafana, ELK stack)
1️⃣8️⃣ Explain how you’d set up centralized logging for containerized applications.
1️⃣9️⃣ What is an SLI, SLO, and SLA?
2️⃣0️⃣ What is tracing, and why is it important in microservices?

⚙️ Orchestration & Containerization
2️⃣1️⃣ What is a Pod in Kubernetes?
2️⃣2️⃣ How does Kubernetes handle scaling and self-healing?
2️⃣3️⃣ What are Helm charts?
2️⃣4️⃣ How would you perform zero-downtime deployments in Kubernetes?
2️⃣5️⃣ What are StatefulSets vs. Deployments?

💡 Cloud & Security
2️⃣6️⃣ What is a Service Mesh (e.g., Istio)?
2️⃣7️⃣ How do you secure container images?
2️⃣8️⃣ What is the principle of least privilege?
2️⃣9️⃣ What is GitOps?
3️⃣0️⃣ How do you manage secrets in Kubernetes? (e.g., Sealed Secrets, HashiCorp Vault).

Experience #12: Posted on July 5, 2025 (Persistent Systems)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7347340336450457600

🚀 My AWS DevOps Interview Experience at Persistent Systems – Bangalore

📌 Interview Questions:
1️⃣ Explain terraform init, terraform plan, terraform apply, terraform validate, terraform output, terraform refresh, terraform input.
2️⃣ How many Jenkins plugins are installed in your project?
3️⃣ How many components are there in a VPC?
4️⃣ Have you worked on CloudWatch?
5️⃣ Share your screen, write a Dockerfile, and explain how you build Docker images.
6️⃣ Can we use a Load Balancer for a single instance?
7️⃣ How do you trigger pipelines in Jenkins using cron jobs?

Experience #13: Posted on July 6, 2025 (Virtusa)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7346174104984571905

Virtusa keeps asking these Data Engineering questions repeatedly.

1. What are the different execution modes in Apache Spark (local, standalone, YARN, Mesos)?
2. What’s the difference between reduceByKey and groupByKey in Spark and when to use each?
3. Explain the architecture of Apache Spark, including the driver, cluster manager, and worker nodes.
4. Compare Spark RDDs vs. DataFrames: use cases and performance implications.
5. Describe internal vs. external Hive tables and their use cases.
6. Explain partitioning in Hive – why it’s used and how it impacts performance.
7. How do you optimize SQL queries for performance? (talk about indexing, execution plans, projection pruning etc.)
8. Describe a data pipeline you’ve built – what tools (e.g., Airflow, PySpark) did you use and what challenges did you face?
9. Explain CI/CD in data engineering – how you implemented it in ETL pipelines.
10. How do you ensure data quality and validation in your pipelines?
11. What experience do you have with AWS data services (e.g., S3, EMR, Glue)? Provide a concrete example.
12. Describe a scenario where you used Spark Streaming with Kafka for real‑time analytics.
13. How have you implemented data modeling in big data contexts (e.g., star schema, snowflake)?
14. Discuss data lakehouse architecture and how it differs from traditional data lakes or warehouses.
15. Provide an example of a performance or scalability issue you faced in a pipeline and how you resolved it.

Experience #14: Posted on July 8, 2025 (Razorpay)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7347610339619848192

Technical Interview Experience – DevOps Engineer at Razorpay (Round 1)

🚀 CI/CD, GitHub Actions & Deployment Strategy
✅ How do you structure GitHub Actions workflows to support multi-service deployment with rollback?
✅ Explain how to perform canary deployments using Helm and Argo Rollouts.
✅ How do you securely inject API keys in GitHub Actions without exposing them in logs?
✅ What’s the most efficient way to implement release versioning across microservices?

🚀 Kubernetes & Scalability
✅ How would you handle CPU throttling in a busy AKS/EKS cluster under payment traffic load?
✅ What are readiness gates in Kubernetes and when would you use them?
✅ Explain how you’d debug an issue where the HPA is not scaling pods despite high CPU usage.
✅ How do you ensure blue-green or shadow deployment testing for critical services?

🚀 Cloud Infrastructure & IaC (Terraform)
✅ How do you structure Terraform code for a multi-region, multi-account AWS setup?
✅ What is your approach to tagging and cost governance across cloud infrastructure?
✅ How do you rotate AWS IAM credentials and ensure services update them dynamically?
✅ What’s the benefit of using for_each over count in complex Terraform modules?

🚀 Observability & Debugging
✅ How do you trace a spike in failed transactions to either app logic, infra failure, or a third-party API?
✅ What’s your approach to building actionable alerts and reducing false positives in PagerDuty/Prometheus?
✅ How would you implement distributed tracing in a Kubernetes-based microservices system?

🚀 Security & Compliance (Fintech-Focused)
✅ How do you prevent hardcoded secrets in Terraform and Dockerfiles?
✅ What’s your strategy for enforcing least privilege access in a CI/CD pipeline?
✅ How do you manage vulnerability scanning and patching in high-frequency deployments?

Experience #15: Posted on July 6, 2025

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7348148042098712577

Kubernetes Scenario-Based Interview Questions – Part 3

1. You want to give every namespace its own set of resource quotas and default limits. How do you implement and enforce that?

2. An app requires elevated privileges to run Docker-in-Docker. How do you securely deploy it in Kubernetes?

3. After a deployment, latency increased significantly for your APIs. What do you check in Kubernetes to identify bottlenecks?

4. You have a CI/CD pipeline that deploys new pods but occasionally leaves old pods hanging. How do you clean these up automatically?

5. How do you allow a pod to run on a specific node only, using Kubernetes features?

6. Your cluster is nearing full disk on nodes due to image bloat. What are some Kubernetes-native ways to mitigate this?

7. You want to restrict a developer from deploying services of type LoadBalancer. How do you enforce it?

8. A team needs to run privileged pods, but you want to allow only a specific namespace to do so. How?

9. How do you prevent accidental deletion of critical resources (like ingress, configmaps) in production?

10. You need to debug a pod running in production without restarting or stopping it. What tools or methods can you use?

11. Your containerized app writes logs to a file instead of stdout. How do you capture and ship these logs?

12. Your cluster uses multiple Ingress controllers, and traffic is not routing correctly. How do you isolate and debug this?

13. How do you test whether your pod can access a specific internal Kubernetes service or endpoint?

14. You need to support blue-green deployments with quick rollback capability. How do you implement this in Kubernetes?

15. After deleting a namespace, it stays in Terminating state. What can you do to force its deletion?

16. You want to validate every YAML deployed to the cluster for security risks. What are your options?

17. How do you rotate TLS certificates used by Kubernetes services without downtime?

18. You need to backup and restore a Kubernetes cluster’s state. What tools or strategies do you recommend?

19. A deployment keeps restarting every few minutes even though the container is healthy. What might be wrong?

20. You want to inject a secret as a mounted file but ensure that no process can read it after the pod starts. Is this possible?

21. You’re seeing a lot of throttling in your app containers. How do you tune CPU limits to reduce this?

22. How do you isolate workloads in a multi-tenant cluster, ensuring both network and resource isolation?

23. A developer wants to deploy apps but should not be able to exec into pods. How do you configure this?

24. You want to track changes made to ConfigMaps and Secrets over time. How do you achieve version control?

25. You are required to implement geo-distributed failover across clusters. What would be your strategy?

Experience #16: Posted on July 8, 2025 (Deloitte)

LinkedIn Post: https://www.linkedin.com/feed/update/urn:li:activity:7348188721541754881

Deloitte asked these SQL questions in a Data Engineering interview

1. Calculate daily user retention for a 30-day cohort window.
2. Retrieve the latest 3 events per user using window functions.
3. Detect out-of-order events in time-series logs using timestamps.
4. Backfill missing dates in a partitioned dataset using calendar table.
5. Compute rolling 7-day distinct active users per platform.
6. Identify users with no activity in the last 30 days (churn analysis).
7. Perform multi-level aggregation (category → subcategory → product).
8. Compare current vs previous month revenue per region.
9. Join fact and dimension tables with SCD Type 2 handling.
10. Detect schema drift between raw and curated layers via metadata.
11. Calculate 95th percentile of transaction amounts by user.
12. Fetch top 3 categories contributing to 80% of total revenue.
13. Identify seasonal patterns in monthly product sales.
14. Use broadcast join hints to optimize small dimension lookups.
15. Flag revenue anomalies using z-score or standard deviation.
16. Remove duplicates using ROW_NUMBER on ingestion timestamp.
17. Calculate session time from login-logout events.
18. Compare revenue growth across last 3 quarters per product line.
19. Identify users who upgraded to premium but never used the features.
20. Track update frequency on records in an SCD Type 2 table.
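
Questions 2 and 16 in this list both rely on the same ROW_NUMBER pattern: number the rows inside each partition, then filter on that number. Below is a minimal sketch using Python's built-in sqlite3 module. The `events` and `raw_orders` tables, their columns, and the sample rows are hypothetical, and timestamps are stored as ISO-8601 strings so that text ordering matches time ordering. It assumes SQLite 3.25 or later for window function support.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical events and raw_orders tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE events (user_id TEXT, event_name TEXT, event_ts TEXT);
        INSERT INTO events VALUES
            ('a', 'login',  '2025-07-01T10:00:00'),
            ('a', 'view',   '2025-07-01T10:05:00'),
            ('a', 'click',  '2025-07-01T10:06:00'),
            ('a', 'logout', '2025-07-01T10:30:00'),
            ('b', 'login',  '2025-07-02T09:00:00'),
            ('b', 'logout', '2025-07-02T09:10:00');
        CREATE TABLE raw_orders (order_id INTEGER, status TEXT, ingested_at TEXT);
        INSERT INTO raw_orders VALUES
            (1, 'created', '2025-07-01T00:00:00'),
            (1, 'shipped', '2025-07-02T00:00:00'),
            (2, 'created', '2025-07-01T00:00:00');
    """)
    return conn


def latest_events_per_user(conn, k=3):
    """Return the k most recent events for each user, newest first."""
    return conn.execute("""
        SELECT user_id, event_name, event_ts
        FROM (
            SELECT user_id, event_name, event_ts,
                   ROW_NUMBER() OVER (
                       PARTITION BY user_id ORDER BY event_ts DESC
                   ) AS rn
            FROM events
        )
        WHERE rn <= ?
        ORDER BY user_id, event_ts DESC
    """, (k,)).fetchall()


def deduplicated_orders(conn):
    """Keep only the most recently ingested row for each order_id."""
    return conn.execute("""
        SELECT order_id, status, ingested_at
        FROM (
            SELECT order_id, status, ingested_at,
                   ROW_NUMBER() OVER (
                       PARTITION BY order_id ORDER BY ingested_at DESC
                   ) AS rn
            FROM raw_orders
        )
        WHERE rn = 1
        ORDER BY order_id
    """).fetchall()
```

The only difference between the two queries is the filter (`rn <= k` versus `rn = 1`), which is a useful point to make in an interview. If two rows can share the same timestamp, adding a tie-breaker column to the ORDER BY keeps the result deterministic.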