Real Interview Questions: Prepare for DevOps, SRE, Cloud & Data Engineering Related Roles #6

Below is a curated list of real candidate experiences, shared directly via LinkedIn. A big thank you to everyone who contributed their real DevOps interview experiences and questions and provided valuable insights. LinkedIn post links are included for reference. This page is intended to support the community, especially those preparing for DevOps/SRE/Cloud & Data Engineering interviews or considering a job change.

A list of all our interview experiences and questions can be FOUND HERE

Real DevOps interview experience and questions

Experience #1: Posted on September 1, 2025

Company: Undisclosed (AI Startup), Role: DevOps Engineer

LinkedIn Post: https://tinyurl.com/wxpyfkze

🔹 DevOps / CI/CD / Kubernetes Questions:

1️⃣ What is SCM, IaC, playbook, and a Docker image?
2️⃣ Explain Git and GitHub in simple terms.
3️⃣ How does GitLab CI/CD work?
4️⃣ What are the stages you have built in Jenkins pipelines?
5️⃣ How do you resolve pipeline errors in Jenkins?
6️⃣ How can you schedule a cron job in shell scripting?
7️⃣ What is Kubernetes and why do we use it?
8️⃣ If a pod crashes, how would you fix/debug it?
9️⃣ How do you secure a Terraform state file?
🔟 How do you manage secrets in DevOps projects?
1️⃣1️⃣ What is a Kubernetes Deployment vs StatefulSet?
1️⃣2️⃣ How do Service and Ingress work in Kubernetes?
1️⃣3️⃣ What is the difference between ConfigMap and Secret in Kubernetes?
1️⃣4️⃣ How do you perform rolling updates & rollbacks in Kubernetes?
1️⃣5️⃣ What are taints and tolerations in Kubernetes, and when would you use them?

🔹 AWS Cloud Questions:

1️⃣ How do you launch and configure an EC2 instance with user data?
2️⃣ How do you configure security groups vs NACLs in a VPC?
3️⃣ Explain how to design a VPC with public and private subnets.
4️⃣ What is the use of a NAT Gateway vs Internet Gateway?
5️⃣ How do you attach and mount EBS volumes to EC2?
6️⃣ How do you configure Auto Scaling for EC2 instances?
7️⃣ What is the difference between Application Load Balancer (ALB) and Network Load Balancer (NLB)?
8️⃣ How do you monitor EC2 using CloudWatch (metrics, alarms, dashboards)?
9️⃣ How do you set up CloudWatch log groups for an application?
🔟 How do you manage RDS backups and automated failover?
1️⃣1️⃣ How do you configure Route 53 for domain hosting and failover routing?
1️⃣2️⃣ What is the difference between Elastic IPs and Public IPs?

Experience #2: Posted on August 27, 2025

Company: MNC, Role: DevOps/Cloud/SRE

LinkedIn Post: https://tinyurl.com/y4t72mvh

1. Tell me about yourself.
2. Which flavours of Linux do you know? Which flavours have you worked with?
3. Which version of RHEL have you worked with?
4. How do you locate a file whose name contains a given string? How do you do this from the root directory? Locate it from a subdirectory as well.
5. How do you search for a string within a file? Within the files of all subdirectories as well?
6. systemd and SysV init: what is the difference, and how are they related?
7. How do you check that all services came up after a system boot?
8. If any service didn’t start, how do you troubleshoot it?
9. On AWS EC2, how do you check that applications came up?
10. On AWS, how do you check logs?
11. Where do you configure the minimum and maximum number of pods?
12. How do you check how many pods were started?
13. What is the lifecycle of a pod?
14. If any pods are failing to start, how do you start troubleshooting?
15. In what ways can a database be installed in AWS?
16. Explain load balancers in AWS.
17. Explain security groups and NACLs.
18. With an example, explain application load balancing and network load balancing.
19. Explain how the min and max pod configuration works with load balancing.
20. How does Kubernetes know when to spin up an additional pod? Give the configuration details.
21. Explain clusters in Kubernetes.
22. To host an enterprise application, what are all the things you would consider?
23. Ingress and egress configuration: explain in detail. How would you configure it?
24. How would you configure routing so that an application accepts ingress traffic?
25. How do you handle database credentials in EKS or K8s?
26. Explain Secrets and ConfigMaps.
27. How is a ConfigMap used within an application?
28. How do you secure traffic with Transport Layer Security (TLS)?
29. Cross-zone load balancing: explain the concept and provide details on how you would configure it.
30. In AWS load balancing, what is the use of sticky sessions?
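
For questions 11 and 20 above: the minimum and maximum pod counts are set in the minReplicas and maxReplicas fields of a HorizontalPodAutoscaler, and the Kubernetes documentation describes its scaling rule as desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric). The following is a minimal Python sketch of that arithmetic only. The function name and the 10% tolerance default are illustrative assumptions, not the controller's actual code.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the HPA replica calculation (illustrative sketch only)."""
    ratio = current_metric / target_metric
    # Assumed behaviour: inside the tolerance band, keep the replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured minimum and maximum.
    return max(min_replicas, min(max_replicas, wanted))


# Hypothetical numbers: 4 pods averaging 90% CPU against a 60% target.
print(desired_replicas(4, 90, 60, min_replicas=2, max_replicas=10))
```

With these made-up inputs the ratio is 1.5, so the sketch asks for 6 replicas; a larger spike would be capped at the maximum.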

Experience #3: Posted on August 26, 2025

Company: Airtel, Role: DevOps/SRE/Cloud

LinkedIn Post: https://tinyurl.com/nhc2ztvw

𝐊𝐮𝐛𝐞𝐫𝐧𝐞𝐭𝐞𝐬
Q. What is the difference between a Deployment and a StatefulSet in Kubernetes?
Q. When should you use a StatefulSet instead of a Deployment?
Q. Can you attach a volume to a Deployment? If yes, how is it different from a StatefulSet?
Q. What could cause a StatefulSet pod to fail when rescheduled to a different availability zone?
Q. How do PV/PVC behave across zones in EKS or Kubernetes in general?
Q. What is a DaemonSet and when would you use it?
Q. If you want two pods per node (instead of one), what alternatives to DaemonSet can you use?
Q. What is a Pod Disruption Budget (PDB) and how is it useful?
Q. How do you handle certificate rotation in on-prem Kubernetes clusters?
Q. What are the challenges with scheduling pods in a multi-node, multi-AZ setup?
Q. How does the Kubernetes scheduler decide where to place pods?
Q. What happens when a StatefulSet pod cannot mount its volume after moving to another node?

🔹 𝐓𝐞𝐫𝐫𝐚𝐟𝐨𝐫𝐦
Q. What are common challenges faced while working with Terraform?
Q. How do you handle state file management in Terraform?
Q. How do you detect and resolve drift in Terraform-managed infrastructure?
Q. How do you manage secrets securely in Terraform?
Q. Why should you use a remote backend for Terraform?

🔹 𝐀𝐖𝐒 & 𝐍𝐞𝐭𝐰𝐨𝐫𝐤𝐢𝐧𝐠
Q. What are all the possible ways to deploy an Nginx server on AWS?
Q. What are the pre-requisites for VPC peering between two VPCs?
Q. What problems occur when two VPCs have overlapping CIDR blocks?
Q. How can you enable communication between overlapping CIDR VPCs?
Q. What is a Transit Gateway and how does it help in VPC communication?
Q. How can a jump server be used in overlapping network scenarios?
Q. Can you explain transitive routing between VPCs A, B, and C?
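
For the overlapping-CIDR questions above, a quick way to check whether two address ranges collide before planning a peering connection is Python's standard ipaddress module. This is a small sketch; the helper name and the VPC ranges are hypothetical and used only for illustration.

```python
import ipaddress


def cidrs_overlap(cidr_a, cidr_b):
    """Return True when the two CIDR blocks share at least one address."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return net_a.overlaps(net_b)


# Hypothetical VPC ranges used only for illustration.
print(cidrs_overlap("10.0.0.0/16", "10.0.128.0/17"))  # True: the /17 sits inside the /16
print(cidrs_overlap("10.0.0.0/16", "10.1.0.0/16"))    # False: disjoint ranges
```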

🔹 𝐂𝐈/𝐂𝐃 – 𝐆𝐢𝐭𝐇𝐮𝐛 𝐀𝐜𝐭𝐢𝐨𝐧𝐬
Q. What CI/CD tools have you used in your current role?
Q. How are you integrating tools like SonarQube, Docker, and Trivy in your pipelines?
Q. How do you trigger a GitHub Actions workflow in another repository?
Q. What is the purpose of repository_dispatch in GitHub Actions?
Q. How would you trigger a CI/CD pipeline in Repo A from changes in Repo B?
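
One common answer to the cross-repository questions above is to have Repo B call GitHub's REST endpoint for repository dispatch events on Repo A. The sketch below only builds the HTTP request and does not send it. The owner, repository, token, and event names are placeholders, and the helper name is an assumption made for this example.

```python
import json
import urllib.request


def build_dispatch_request(owner, repo, token, event_type, payload=None):
    """Build (but do not send) a repository_dispatch request for GitHub's REST API."""
    body = {"event_type": event_type, "client_payload": payload or {}}
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/dispatches",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder values; actually sending would be urllib.request.urlopen(req).
req = build_dispatch_request("example-org", "repo-a", "TOKEN",
                             "repo-b-changed", {"ref": "main"})
print(req.get_method(), req.full_url)
```

On the receiving side, the workflow in Repo A would listen with `on: repository_dispatch` and, optionally, a `types` filter matching the event name chosen here.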

Experience #4: Posted on August 26, 2025

Company: TransUnion, Role: Data Engineer

LinkedIn Post: https://tinyurl.com/43ss823f

1. Can you explain how you would design a scalable data pipeline that ingests millions of credit records daily with minimal latency?
2. How do you handle schema evolution in a data lake environment?
3. What is the difference between batch processing and stream processing? In which scenarios would you prefer one over the other?
4. How would you design a data model for storing customer credit histories that supports both fast lookups and analytical queries?

5. Given a table of credit transactions, how would you write a SQL query to find customers whose credit utilization increased by more than 30% month over month?
6. What are indexes in a database, and how do they help in query performance? Can they sometimes slow down operations?
7. How would you optimize a slow-running SQL query on a 1 TB table?

8. Can you explain how you’ve used Spark for ETL pipelines? What are the common performance bottlenecks in Spark jobs and how do you fix them?
9. What’s the difference between a data warehouse (like Snowflake, Redshift, BigQuery) and a data lake (like S3, ADLS)?

10. How do you ensure data quality when building pipelines (for example, no duplicate customer records or missing values in critical fields)?
11. How do you handle Personally Identifiable Information (PII) in your pipelines to ensure compliance with regulations like GDPR or CCPA?

12. Suppose one of your pipelines that processes daily credit score updates is delayed by 4 hours – how would you troubleshoot and fix it?
13. Imagine you need to migrate a large on-premises Oracle data warehouse to AWS Redshift. What steps would you take to ensure a smooth migration?
14. How would you design a system to provide near-real-time fraud detection alerts based on credit card transaction streams?

Experience #5: Posted on August 26, 2025

Company: Undisclosed, Role: DevOps Engineer

LinkedIn Post: https://tinyurl.com/s4mxp88w

1) Tell me about yourself.
2) Explain your experience with CI/CD tools and how you have used them.
3) Give me some Linux commands.
4) Command to generate an SSH key?
5) What if a user loses their SSH key?
6) Command to show memory usage and CPU processes?
7) Command to kill a process?
8) How will you change user access or privileges?
9) What are GitHub Actions?
10) Difference between GitHub Actions and Jenkins?
11) Any branching strategies you followed in your organization?
12) How confident are you in Kubernetes and Docker?
13) How will you stop a pod in K8s?
14) How will you replicate a pod?
15) Command to get logs in K8s?
16) What will you do if a pod is not responding?
17) What will you do if a pod is getting more load and needs to stay healthy before it dies?
18) What is Docker?
19) How is Docker useful, and how will you implement it in your pipeline?
20) How will you find merge conflicts?
21) What kind of tools do you prefer for SAST and DAST security testing?
22) How will you manage the ServiceNow tasks assigned to you, and on what basis will you pick and resolve them?

2nd Round:

1) How much experience do you have in writing pipeline scripts?
2) Write a pipeline script in Groovy that integrates some tools.
3) Have you created a pipeline script end to end? What kind of tools did you use?
4) How will you create GitHub Actions?
5) Did you give any ideas to your team or project to improve deployments? If so, what were they?
6) How would you rate yourself in Linux and Python?
7) How good are you at using AI in DevOps (GitHub Copilot)?
8) What about monitoring tools? Which ones have you used in your organization?
9) How are alerts managed in Prometheus or Grafana?

Experience #6: Posted on August 20, 2025

Company: Hexaware, Role: Data Engineer

LinkedIn Post: https://tinyurl.com/3cuhk4fk

🔹 Round 1 – Interview Questions:
• Difference between cache and persist
• Different types of secret scopes in Databricks
• How broadcast join works
• Program to print a string in reverse
• Different transformations used in Databricks
• What is a partition in Databricks
• Different file reading modes (e.g., fail fast, permissive, drop malformed)
• Common sources and destination formats used in ADB
• How to store data in different layers (Bronze, Silver, Gold)
• Difference between union and unionAll
• Where do you apply cache in the code? (with example)
• What is a partition, and how many are allocated by default?
• Different components of Databricks
• Difference between count(column) and count(*)
• Difference between distinct and dropDuplicates
• Difference between Fact and Dimension tables
• Different ways to access data from ADLS to ADB
• How to create a mount point
• How to improve performance in a slow-running Spark job
• What is autoscaling in Databricks
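
For the count(column) versus count(*) question above, the standard SQL behaviour is that COUNT(*) counts every row while COUNT(column) skips rows where that column is NULL. A small sketch using Python's built-in sqlite3 module shows the difference. SQLite is only a stand-in here, and the table and values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, coupon TEXT)")
# Two of the four hypothetical orders have no coupon (NULL).
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "A10"), (2, None), (3, "B20"), (4, None)],
)
all_rows, non_null = conn.execute(
    "SELECT COUNT(*), COUNT(coupon) FROM orders"
).fetchone()
print(all_rows, non_null)  # 4 rows in total, 2 with a non-NULL coupon
conn.close()
```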

Experience #7: Posted on August 20, 2025

Company: Nvidia, Role: DevOps Engineer

LinkedIn Post: https://tinyurl.com/pmnp8kac

Round 1 – AI/HPC Scaling, K8s, Cloud & Linux
1. How would you auto-scale GPU nodes for training workloads without wasting GPU hours on idle pods?
2. A multi-cluster, multi-region AI training job fails halfway because one cluster runs out of GPU memory. How do you rebalance workloads live?
3. How do you configure Kubernetes taints and tolerations for GPU workloads?
4. How would you handle CUDA driver upgrades in K8s without disrupting thousands of running AI pods?
5. Explain how you’d pre-warm GPU nodes for massive AI inference traffic (e.g., ChatGPT-scale) with zero cold-start penalty.
6. How would you monitor GPU utilization in real-time in a Kubernetes cluster?

Round 2 – RCA, Fire Drills & GPU Chaos
1. How would you check if a GPU pod in Kubernetes is using the GPU assigned to it?
2. What are NCCL logs, and why are they important in distributed training?
3. Persistent storage for AI datasets starts showing 200ms+ latency. How do you pinpoint whether it’s the storage backend, the network, or the GPU node?
4. A Kubernetes GPU pod requests 16GB VRAM but only gets 12GB due to fragmentation. How do you detect and fix in real-time?
5. Your AI pipeline cost doubles in 24 hours with no infra change. Profiling shows a silent GPU resource leak. How do you hunt it down?

Round 3 – Leadership, Reliability Culture & Scaling Influence
1. How do you set up SLOs for both AI inference latency and batch training completion times without overprovisioning GPUs?
2. You’re told to implement multi-region AI inference failover without DNS-based routing. What’s your plan?
3. How do you justify infra cost for idle GPU pre-warming to leadership when each hour costs $30–$40 per GPU?

Experience #8: Posted on August 19, 2025

Company: J. P. Morgan, Role: Senior DevOps Engineer

LinkedIn Post: https://tinyurl.com/5yszcdyu

Round 1 & 2 – Technical Deep Dive

1. You’ve deployed an app to Azure Kubernetes Service (AKS) and it fails health checks randomly. How do you debug this end-to-end?

2. In a canary deployment to production, half the traffic returns 502, while others succeed. Walk us through your troubleshooting approach.

3. CI/CD pipeline takes 40 mins to deploy a small change. What would you do to optimize it?

4. You see high CPU usage in one pod, but logs look clean. What next?

5. You’re asked to design a highly available logging system for 100+ microservices across 3 regions. What tools and architecture would you suggest?

6. Production app works fine for internal users but fails for external ones (403 error). How will you isolate the issue?

7. How do you ensure secure and dynamic secret rotation in Azure DevOps pipelines?

8. Explain how you’d use Azure Application Gateway with Web Application Firewall for a sensitive banking application.

9. During an Azure deployment, you receive intermittent DNS resolution issues. What can be the causes?

10. A user reports 10-second delays every 15 minutes in an app running on AKS. No code changes happened. How would you begin RCA?

11. Jenkins jobs are randomly failing at the artifact upload step. What layers would you check?

12. How would you set up an automated rollback strategy in Kubernetes for failed deployments?

13. Design a cost-optimized cloud architecture for an internal reporting app that runs every night and stores logs for 3 years.

14. How do you handle zero-downtime database migrations in a distributed application?

15. What’s your approach to disaster recovery for stateful apps running on containers?

16. An Azure function is being throttled. How will you detect and fix it?

17. Define a plan for blue/green deployment with rollback on Azure using Terraform and pipelines.

18. How would you monitor end-to-end SLA for services involved in a payments pipeline?

19. Explain the difference in scaling strategies for compute-intensive vs I/O-intensive workloads in Azure.

20. Suppose your production pipeline is blocked due to missing approvals and stakeholders are unreachable. What will you do?

Experience #9: Posted on August 19, 2025

Company: Undisclosed, Role: Data Engineer

LinkedIn Post: https://tinyurl.com/2sme4yad

𝗦𝗤𝗟/𝗣𝘆𝘁𝗵𝗼𝗻
1. Write a query to return the top 3 products by revenue per category using window functions (ties handled deterministically).
2. From logins(user_id, login_date), compute the Day-1 retention: users who returned the day after their first login.
3. Find the smallest and largest number in an array
4. Check if a string is a subsequence of another string
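
Questions 3 and 4 above are short coding exercises. One possible approach for each is sketched below in Python: a single pass for the smallest and largest values, and a pointer scan for the subsequence check. The function names are chosen for this example only.

```python
def min_max(values):
    """Return (smallest, largest) in one pass without min()/max()."""
    if not values:
        raise ValueError("values must not be empty")
    lo = hi = values[0]
    for v in values[1:]:
        if v < lo:
            lo = v
        elif v > hi:
            hi = v
    return lo, hi


def is_subsequence(small, large):
    """Return True if `small` can be formed by deleting characters from `large`."""
    i = 0  # index of the next character of `small` still to be matched
    for ch in large:
        if i < len(small) and ch == small[i]:
            i += 1
    return i == len(small)


print(min_max([3, -1, 7, 4]))          # (-1, 7)
print(is_subsequence("ace", "abcde"))  # True
print(is_subsequence("aec", "abcde"))  # False
```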

𝗣𝘆𝘀𝗽𝗮𝗿𝗸
5. From events(user_id, ts, action), assign session IDs where gaps >30 min start a new session (window + lag).
6. Deduplicate events(event_id, user_id, ts, payload) keeping the latest row per event_id (watermarking optional).
7. Compute running total sales per user ordered by date; ensure correctness with late/out-of-order records.
8. Read mixed schema Parquet (some files add age), and merge schema safely; explain partitioning and file sizing choices.
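
The sessionization logic in question 5 can be reasoned about outside Spark first. In PySpark it is usually expressed with lag over a window partitioned by user_id plus a running sum of new-session flags; the plain-Python sketch below mirrors that logic under the assumption that events arrive as (user_id, ts) pairs. The function name and session ID format are hypothetical.

```python
from datetime import datetime, timedelta


def assign_sessions(events, gap=timedelta(minutes=30)):
    """Return (user_id, ts, session_id) tuples; events are (user_id, ts) pairs."""
    result = []
    last_ts = {}     # user_id -> timestamp of that user's previous event
    session_no = {}  # user_id -> current session counter
    for user_id, ts in sorted(events):
        prev = last_ts.get(user_id)
        # The first event for a user, or a gap longer than `gap`, opens a new session.
        if prev is None or ts - prev > gap:
            session_no[user_id] = session_no.get(user_id, 0) + 1
        last_ts[user_id] = ts
        result.append((user_id, ts, f"{user_id}-{session_no[user_id]}"))
    return result


sample = [
    ("u1", datetime(2025, 1, 1, 10, 0)),
    ("u1", datetime(2025, 1, 1, 10, 20)),
    ("u1", datetime(2025, 1, 1, 11, 0)),  # 40 minutes after the previous event
]
for row in assign_sessions(sample):
    print(row[2])
```

Sorting by (user_id, ts) plays the role of the window's partition and ordering, and the per-user counter plays the role of the cumulative sum.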

𝗔𝗗𝗙 / 𝗗𝗮𝘁𝗮𝗯𝗿𝗶𝗰𝗸𝘀
9. Design an incremental load from on-prem SQL to ADLS → Synapse using ADF (watermark/CDC, retries, idempotency).
10. ADF pipeline intermittently fails due to API throttling – how do you harden the pipeline (IR choice, concurrency, backoff, until/retry)?
11. In Databricks, build a bronze → silver → gold Delta pipeline with SCD Type 2 in silver; outline jobs/clusters and QA checks.
12. Compare Mapping Data Flows vs. Databricks notebooks for complex joins/aggregations on 2 TB – when do you pick each?

𝗦𝘆𝗻𝗮𝗽𝘀𝗲 / 𝗙𝗮𝗯𝗿𝗶𝗰
13. In Synapse Dedicated SQL, design a table and distribution strategy for a 1B-row fact table; justify hash vs. round-robin vs. replicate.
14. Implement PolyBase/OPENROWSET external tables to query Parquet in ADLS; discuss stats, result-set caching, and pitfalls.
15. In Microsoft Fabric, design a lakehouse with OneLake + Delta – explain shortcuts, medallion layers, and governance with domains.
16. Migrate an existing Synapse DW to Fabric Warehouse: outline compatibility gaps, cost model changes, and performance tuning steps.

Experience #10: Posted on August 15, 2025

Company: Morgan Stanley, Role: Python Data Engineer

LinkedIn Post: https://tinyurl.com/5n6h6pbp

1️⃣ Write a Python program to reverse a string without using built-in functions.

2️⃣ Given a list of integers, find the second largest element without sorting.

3️⃣ Implement a function to check if a string is a palindrome.

4️⃣ Write a Python program to count the frequency of each character in a string.

5️⃣ Given a list of numbers, remove all duplicates without using set().

6️⃣ Write a Python program to merge two sorted lists into one sorted list.

7️⃣ Implement a function to find the factorial of a number using recursion.

8️⃣ Write a Python program to find all prime numbers between 1 and 100.

9️⃣ Given a list of integers, find the pair whose sum is closest to a given target.

🔟 Implement a function to flatten a nested list (e.g., [1, [2, [3, 4]], 5] → [1, 2, 3, 4, 5]).
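
Two of the exercises above, sketched in Python as one possible approach each. The second-largest function assumes "second largest" means the second largest distinct value, which is an interpretation and not something the question states. Function names are chosen for this example.

```python
def second_largest(nums):
    """Return the second largest distinct value without sorting."""
    largest = second = None
    for n in nums:
        if largest is None or n > largest:
            # New maximum: the old maximum becomes the runner-up.
            largest, second = n, largest
        elif n != largest and (second is None or n > second):
            second = n
    if second is None:
        raise ValueError("need at least two distinct values")
    return second


def flatten(nested):
    """Recursively flatten arbitrarily nested lists into one flat list."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


print(second_largest([4, 9, 2, 9, 7]))  # 7
print(flatten([1, [2, [3, 4]], 5]))     # [1, 2, 3, 4, 5]
```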

Experience #11: Posted on August 15, 2025

Company: Credit Suisse, Role: Azure Data Engineer

LinkedIn Post: https://tinyurl.com/ahsu9c7f

𝗣𝗿𝗲𝗹𝗶𝗺𝗶𝗻𝗮𝗿𝘆 𝗥𝗼𝘂𝗻𝗱 (𝗢𝗻𝗹𝗶𝗻𝗲 𝗔𝘀𝘀𝗲𝘀𝘀𝗺𝗲𝗻𝘁)

– SQL Coding (Joins, Window functions)
– Data Structures (arrays, strings coding problems – medium level)

𝟭𝘀𝘁 𝗥𝗼𝘂𝗻𝗱 (𝗧𝗲𝗰𝗵𝗻𝗶𝗰𝗮𝗹 𝗜𝗻𝘁𝗲𝗿𝘃𝗶𝗲𝘄 𝟭)

– Difference between Azure Synapse Dedicated SQL Pool vs Serverless SQL Pool
– ADLS Gen2 storage hierarchy, partitioning strategy, and performance considerations
– PolyBase vs COPY statement in Synapse
– Data ingestion using ADF (Copy Activity vs Dataflow)
– Delta Lake vs Parquet formats and advantages of Delta
– Spark narrow vs wide transformations, coalesce() vs repartition()
– Scheduling jobs in Azure Databricks (Job clusters, Workflows, ADF integration)
– Azure Data Lake Storage vs Blob Storage

𝟮𝗻𝗱 𝗥𝗼𝘂𝗻𝗱 (𝗧𝗲𝗰𝗵𝗻𝗶𝗰𝗮𝗹 𝗜𝗻𝘁𝗲𝗿𝘃𝗶𝗲𝘄 𝟮 – 𝗦𝗲𝗻𝗶𝗼𝗿 𝗠𝗮𝗻𝗮𝗴𝗲𝗿)

– Design a schema for storing stock trades (orders, customers, instruments) → normalized vs denormalized.
– Explain when to use Snowflake Schema vs Star Schema in a data warehouse
– What validations would you implement in a staging zone before loading curated data?
– When would you recommend Azure Stream Analytics vs Spark Structured Streaming?
– End to End Project Discussion.

𝗧𝗲𝗰𝗵𝗻𝗼-𝗠𝗮𝗻𝗮𝗴𝗲𝗿𝗶𝗮𝗹 𝗥𝗼𝘂𝗻𝗱

– Describe a time you had conflicting deadlines in two projects. How did you prioritize?
– What advantages does Delta Lake provide over plain Parquet in a financial use case?
– How do you size a Databricks cluster for heavy ETL workloads?
– Compare Synapse Dedicated SQL Pool vs Cosmos DB for transactional analytics.

Experience #12: Posted on August 14, 2025

Company: Amazon, Role: DevOps (Phone Interview Questions)

LinkedIn Post: https://tinyurl.com/2uubtkaf

1. Behavioral Questions (Amazon Leadership Principles Focus)

✓ a. “Tell me about a time when you had to solve a challenging problem under a tight deadline.”

✓ b. “Describe a situation where you had to work with a difficult colleague. How did you handle it?”

✓ c. “Give an example of when you improved a process. What was the impact?”

✓ d. “Tell me about a time you took ownership of a project. How did you ensure its success?”

2. Technical Knowledge and Tools

✓ a. “What is CI/CD? Can you walk me through the process of setting up a simple CI/CD pipeline?”

✓ b. “What tools have you used for continuous integration and continuous delivery?”

✓ c. “Explain the role of Docker in DevOps. How would you deploy an application in Docker?”

✓ d. “What is Kubernetes, and how does it help in container orchestration?”

✓ e. “What is the difference between a monolithic and microservices-based architecture?”

✓ f. “How do you handle version control? What Git workflows have you used?”

3. Cloud Infrastructure Questions (AWS Focus)

✓ a. “What AWS services have you worked with? Which ones do you consider most important for a DevOps engineer?”

✓ b. “Can you explain how you would set up a VPC (Virtual Private Cloud) in AWS?”

✓ c. “What is EC2, and how would you use it to deploy a simple web application?”

✓ d. “What is IAM in AWS, and how do you manage access control?”

✓ e. “Explain the difference between AWS EC2 and Lambda.”

4. Scripting & Automation Questions

✓ a. “What scripting languages do you use? Can you give an example of a script you wrote to automate a task?”

✓ b. “Write a simple script that checks if a given file exists and outputs an appropriate message.”

✓ c. “How would you automate the deployment of an application using Jenkins?”
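
For question 4b, one possible answer in Python uses pathlib from the standard library. This is a minimal sketch; the function name and message wording are illustrative.

```python
from pathlib import Path


def describe_path(path_str):
    """Return a message saying whether `path_str` is an existing regular file."""
    path = Path(path_str)
    if path.is_file():
        return f"{path} exists."
    if path.exists():
        # The path is present but is a directory or another non-file entry.
        return f"{path} exists but is not a regular file."
    return f"{path} does not exist."


# Hypothetical file name used only for illustration.
print(describe_path("deploy.log"))
```

Distinguishing "exists but is not a file" from "does not exist" is a small touch that tends to come up as a follow-up question.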

5. Problem-Solving and Troubleshooting

✓ a. “Imagine your team is deploying an application, but something breaks during the process. How would you troubleshoot it?”

✓ b. “If a production environment is underperforming, how would you go about diagnosing the issue?”

✓ c. “You’re getting a sudden spike in traffic. How do you ensure your system scales efficiently?”

6. System Design Basics

✓ a. “How would you design a CI/CD pipeline for a microservices-based application?”

✓ b. “What are the key factors to consider when designing a scalable, fault-tolerant cloud infrastructure?”

Experience #13: Posted on August 10, 2025

Company: Undisclosed, Role: DevOps Engineer

LinkedIn Post: https://tinyurl.com/r7xbdvdv

📌 Topics & Questions I Faced:

● 🧑‍💻 Self Introduction & Real-Time Project
● 🔄 Git:
Difference between merge and revert
How to use cherry-pick
How to resolve merge conflicts
● 🖥️ Linux:
How to check CPU usage
● 🧪 Jenkins:
Pipeline stages (build, test, deploy)
How credentials are stored
Parallel execution in Jenkinsfile
● 🐳 Docker:
Multi-stage Dockerfile example
● ☸️ Kubernetes:
Auto-scaling setup (HPA)
Deployment stages (Deployment, ReplicaSet, Pod)
Which ones I used and why
● 🔁 CI/CD:
Experience with GitHub Actions and AWS CodePipeline
● 🖥️ AWS Lambda:
Use case and deployment
● 🌐 Hosting:
How to deploy a static website in AWS
● 🔐 Terraform:
Storing secrets securely
Difference between plan and apply
What is drift?
● ⚙️ Ansible vs Terraform
Why I used Ansible in my current project
● 🌐 AWS Networking:
VPC connectivity
Internet Gateway vs NAT Gateway

Experience #14: Posted on August 9, 2025

Company: Databricks, Role: Data Engineer

LinkedIn Post: https://tinyurl.com/bdecch3e

➡️ What is predictive optimisation?
➡️ What is liquid clustering?
➡️ In Delta Sharing, what is CDF (Change Data Feed)?
➡️ Can you share Delta tables with history?
➡️ In Auto Loader, if the table gets truncated, how do you recover the data?
➡️ Explain how to implement DLT.
➡️ Explain the optimizations applied to your data in Databricks.
➡️ What is the volume of data processed each day?
➡️ What are SQL warehouses, and how have they made it easier to connect to other tools?
➡️ What is Unity Catalog?
➡️ Why did your organisation shift to UC managed tables and not external tables?

Experience #15: Posted on July 19, 2025

Company: Undisclosed, Role: Data Engineer

LinkedIn Post: https://tinyurl.com/mpmzpufp

𝗦𝗤𝗟
– Write a query to find the second highest salary from an employee table without using MAX in subquery.
– Explain the difference between INNER JOIN, LEFT JOIN, and FULL OUTER JOIN with examples.
– Given a sales table, find the top 3 products sold in each region using RANK or DENSE_RANK.
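
For the second-highest-salary question above, one approach that avoids MAX in a subquery is to order the distinct salaries and skip the first. The sketch below runs that query through Python's built-in sqlite3 module. SQLite is a stand-in for whatever database the interviewer has in mind, and the table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
# Hypothetical rows; two employees share the top salary.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Asha", 90000), ("Ben", 120000), ("Chen", 120000), ("Dev", 75000)],
)
row = conn.execute(
    "SELECT DISTINCT salary FROM employee ORDER BY salary DESC LIMIT 1 OFFSET 1"
).fetchone()
# fetchone() returns None when there is no second distinct salary.
second_highest = row[0] if row else None
print(second_highest)  # 90000
conn.close()
```

DISTINCT matters here: without it, the tie at the top would make the query return 120000 again.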

𝗣𝘆𝘁𝗵𝗼𝗻
– How do you handle a file that is too large to fit into memory in Python? Show an example.
– What is the difference between mutable and immutable data types? Give examples.
– Write Python code to merge multiple CSV files from a directory into a single Pandas DataFrame.
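
For the large-file question above, the usual idea is to stream the file and never load it whole. A minimal sketch follows, relying on the fact that Python file objects yield one line at a time when iterated. The function name and the search-term use case are assumptions made for illustration.

```python
def count_matching_lines(path, needle, encoding="utf-8"):
    """Count lines containing `needle`, reading one line at a time."""
    matches = 0
    with open(path, "r", encoding=encoding) as handle:
        for line in handle:  # lazy iteration: only one line is held in memory
            if needle in line:
                matches += 1
    return matches
```

The same streaming idea applies to tabular data, for example by passing chunksize to pandas.read_csv and processing each chunk in turn.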

𝗣𝘆𝘀𝗽𝗮𝗿𝗸
– Explain the difference between narrow and wide transformations in Spark. Give examples.
– How do you perform joins in PySpark and handle skewed data?
– Given a large dataset, write PySpark code to calculate the average sales per region per month.

𝗔𝘇𝘂𝗿𝗲 𝗗𝗮𝘁𝗮𝗯𝗿𝗶𝗰𝗸𝘀
– Explain the difference between a cluster, job cluster, and all-purpose cluster in Databricks.
– How do you optimize Delta Lake performance in Databricks?
– How do you integrate Databricks with Azure Data Lake Storage for both reading and writing data?

𝗔𝘇𝘂𝗿𝗲 𝗗𝗮𝘁𝗮 𝗙𝗮𝗰𝘁𝗼𝗿𝘆
– What is the difference between mapping data flows and wrangling data flows?
– How do you implement incremental load in Azure Data Factory?
– Explain how triggers work in ADF and how to schedule pipelines.

𝗠𝗶𝗰𝗿𝗼𝘀𝗼𝗳𝘁 𝗙𝗮𝗯𝗿𝗶𝗰
– Explain the architecture of OneLake in Microsoft Fabric.
– How does Fabric differ from Azure Synapse Analytics and Power BI?
– How would you migrate existing ADF pipelines to Microsoft Fabric Data Factory?

𝗔𝘇𝘂𝗿𝗲 𝗦𝘆𝗻𝗮𝗽𝘀𝗲 𝗔𝗻𝗮𝗹𝘆𝘁𝗶𝗰𝘀
– Explain the difference between dedicated SQL pool and serverless SQL pool in Synapse.
– How do you partition data in Synapse for better query performance?
– What are materialized views in Synapse and when should you use them?
