Below is a curated list of real candidate interview experiences, shared directly via LinkedIn. A big thank you to everyone who contributed their DevOps interview experiences and questions and provided valuable insights. LinkedIn post links are included for reference. This page is intended to support the community, especially those preparing for DevOps/SRE/Cloud and Data Engineering interviews or considering a job change.
A list of all our interview experiences and questions can be FOUND HERE

Experience #1: Posted on September 1, 2025
Company: Undisclosed (AI Startup), Role: DevOps Engineer
LinkedIn Post: https://tinyurl.com/wxpyfkze
DevOps / CI/CD / Kubernetes Questions:
1. What is SCM, IaC, playbook, and a Docker image?
2. Explain Git and GitHub in simple terms.
3. How does GitLab CI/CD work?
4. What are the stages you have built in Jenkins pipelines?
5. How do you resolve pipeline errors in Jenkins?
6. How can you schedule a cron job in shell scripting?
7. What is Kubernetes and why do we use it?
8. If a pod crashes, how would you fix/debug it?
9. How do you secure a Terraform state file?
10. How do you manage secrets in DevOps projects?
11. What is a Kubernetes Deployment vs StatefulSet?
12. How does Service & Ingress work in Kubernetes?
13. What is the difference between ConfigMap and Secret in Kubernetes?
14. How do you perform rolling updates & rollbacks in Kubernetes?
15. What are taints and tolerations in Kubernetes, and when would you use them?
AWS Cloud Questions:
1. How do you launch and configure an EC2 instance with user data?
2. How do you configure security groups vs NACLs in a VPC?
3. Explain how to design a VPC with public and private subnets.
4. What is the use of a NAT Gateway vs Internet Gateway?
5. How do you attach and mount EBS volumes to EC2?
6. How do you configure Auto Scaling for EC2 instances?
7. What is the difference between Application Load Balancer (ALB) and Network Load Balancer (NLB)?
8. How do you monitor EC2 using CloudWatch (metrics, alarms, dashboards)?
9. How do you set up CloudWatch log groups for an application?
10. How do you manage RDS backups and automated failover?
11. How do you configure Route 53 for domain hosting and failover routing?
12. What is the difference between Elastic IPs and Public IPs?
Experience #2: Posted on August 27, 2025
Company: MNC, Role: DevOps/Cloud/SRE
LinkedIn Post: https://tinyurl.com/y4t72mvh
1. Tell us about yourself.
2. Which flavours of Linux do you know? Which flavours have you worked with?
3. Which version of RHEL have you worked with?
4. How do you locate a file whose name contains a string? How do you do this from the root directory? Locate it from a subdirectory as well.
5. How do you search for a string within a file? Across files in all subdirectories as well?
6. Systemd and system init: difference or correlation.
7. How do you check that all services came up after a system boot?
8. If any services didn’t start, how do you troubleshoot?
9. On AWS EC2, how do you check that applications came up?
10. On AWS, how do you check logs?
11. Where do you configure the minimum and maximum number of pods?
12. How do you check how many pods were started?
13. What is the life cycle of a pod?
14. If any pods are failing to start, how do you start troubleshooting?
15. In what ways can a database be installed on AWS?
16. Explain load balancers in AWS.
17. Explain security groups and NACLs.
18. With examples, explain application load balancing and network load balancing.
19. Explain how the min and max pod configuration works with load balancing.
20. How does Kubernetes know when to spin up an additional pod? Give the configuration details.
21. Explain clusters in Kubernetes.
22. To host an enterprise application, what are all the things you would consider?
23. Ingress and egress configuration: explain in detail. How would you configure it?
24. How would you configure routing for an application to accept ingress?
25. How do you handle database credentials in EKS or K8s?
26. Explain Secrets and ConfigMaps.
27. How is a ConfigMap used within an application?
28. How do you secure the transport layer (TLS)?
29. Cross-zone load balancing: explain the concept and provide details on how you would configure it.
30. In AWS load balancing, what is the use of sticky sessions?
Experience #3: Posted on August 26, 2025
Company: Airtel, Role: DevOps/SRE/Cloud
LinkedIn Post: https://tinyurl.com/nhc2ztvw
Kubernetes
Q. What is the difference between a Deployment and a StatefulSet in Kubernetes?
Q. When should you use a StatefulSet instead of a Deployment?
Q. Can you attach a volume to a Deployment? If yes, how is it different from a StatefulSet?
Q. What could cause a StatefulSet pod to fail when rescheduled to a different availability zone?
Q. How do PV/PVC behave across zones in EKS or Kubernetes in general?
Q. What is a DaemonSet and when would you use it?
Q. If you want two pods per node (instead of one), what alternatives to DaemonSet can you use?
Q. What is a Pod Disruption Budget (PDB) and how is it useful?
Q. How do you handle certificate rotation in on-prem Kubernetes clusters?
Q. What are the challenges with scheduling pods in a multi-node, multi-AZ setup?
Q. How does the Kubernetes scheduler decide where to place pods?
Q. What happens when a StatefulSet pod cannot mount its volume after moving to another node?
Terraform
Q. What are common challenges faced while working with Terraform?
Q. How do you handle state file management in Terraform?
Q. How do you detect and resolve drift in Terraform-managed infrastructure?
Q. How do you manage secrets securely in Terraform?
Q. Why should you use a remote backend for Terraform?
AWS & Networking
Q. What are all the possible ways to deploy an Nginx server on AWS?
Q. What are the pre-requisites for VPC peering between two VPCs?
Q. What problems occur when two VPCs have overlapping CIDR blocks?
Q. How can you enable communication between overlapping CIDR VPCs?
Q. What is a Transit Gateway and how does it help in VPC communication?
Q. How can a jump server be used in overlapping network scenarios?
Q. Can you explain transitive routing between VPCs A, B, and C?
CI/CD – GitHub Actions
Q. What CI/CD tools have you used in your current role?
Q. How are you integrating tools like SonarQube, Docker, and Trivy in your pipelines?
Q. How do you trigger a GitHub Actions workflow in another repository?
Q. What is the purpose of repository_dispatch in GitHub Actions?
Q. How would you trigger a CI/CD pipeline in Repo A from changes in Repo B?
Experience #4: Posted on August 26, 2025
Company: TransUnion, Role: Data Engineer
LinkedIn Post: https://tinyurl.com/43ss823f
1. Can you explain how you would design a scalable data pipeline that ingests millions of credit records daily with minimal latency?
2. How do you handle schema evolution in a data lake environment?
3. What is the difference between batch processing and stream processing? In which scenarios would you prefer one over the other?
4. How would you design a data model for storing customer credit histories that supports both fast lookups and analytical queries?
5. Given a table of credit transactions, how would you write a SQL query to find customers whose credit utilization increased by more than 30% month over month?
6. What are indexes in a database, and how do they help in query performance? Can they sometimes slow down operations?
7. How would you optimize a slow-running SQL query on a 1 TB table?
8. Can you explain how you’ve used Spark for ETL pipelines? What are the common performance bottlenecks in Spark jobs and how do you fix them?
9. What’s the difference between a data warehouse (like Snowflake, Redshift, BigQuery) and a data lake (like S3, ADLS)?
10. How do you ensure data quality when building pipelines (for example, no duplicate customer records or missing values in critical fields)?
11. How do you handle Personally Identifiable Information (PII) in your pipelines to ensure compliance with regulations like GDPR or CCPA?
12. Suppose one of your pipelines that processes daily credit score updates is delayed by 4 hours – how would you troubleshoot and fix it?
13. Imagine you need to migrate a large on-premises Oracle data warehouse to AWS Redshift. What steps would you take to ensure a smooth migration?
14. How would you design a system to provide near-real-time fraud detection alerts based on credit card transaction streams?
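For question 5 above, one possible approach is sketched below in Python using the standard-library sqlite3 module. It assumes a hypothetical, already aggregated table monthly_utilization(customer_id, month, utilization) instead of raw transactions, and it reads "increased by more than 30%" as a relative increase over the immediately preceding calendar month. Both are assumptions, not details from the interview.

```python
import sqlite3

# Hypothetical schema: one row per customer per month, month stored as 'YYYY-MM'.
# The self-join pairs each month with the previous calendar month of the same customer.
QUERY = """
SELECT cur.customer_id,
       cur.month,
       prev.utilization AS prev_utilization,
       cur.utilization  AS cur_utilization
FROM monthly_utilization AS cur
JOIN monthly_utilization AS prev
  ON prev.customer_id = cur.customer_id
 AND prev.month = strftime('%Y-%m', cur.month || '-01', '-1 month')
WHERE prev.utilization > 0
  AND cur.utilization > prev.utilization * 1.3
ORDER BY cur.customer_id, cur.month
"""


def find_utilization_jumps(rows):
    """Load (customer_id, month, utilization) rows into SQLite and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE monthly_utilization "
        "(customer_id INTEGER, month TEXT, utilization REAL)"
    )
    conn.executemany("INSERT INTO monthly_utilization VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


sample_rows = [
    (1, "2025-01", 0.20), (1, "2025-02", 0.30),  # +50%: flagged
    (2, "2025-01", 0.50), (2, "2025-02", 0.55),  # +10%: not flagged
    (3, "2025-01", 0.10), (3, "2025-03", 0.50),  # February missing: no comparison
]
for row in find_utilization_jumps(sample_rows):
    print(row)
```

In an interview the same idea is often written with LAG() over a window partitioned by customer. The self-join is used here only because it also works on older SQLite builds, and it deliberately skips months whose predecessor is missing.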
Experience #5: Posted on August 26, 2025
Company: Undisclosed, Role: DevOps Engineer
LinkedIn Post: https://tinyurl.com/s4mxp88w
1) Tell me about yourself.
2) Explain your experience with CI/CD tools and how you used them.
3) Give me some Linux commands.
4) Command to generate an SSH key?
5) What if a user loses their SSH key?
6) Command to show memory usage and CPU processes?
7) Command to kill a process?
8) How will you change user access or privileges?
9) What are GitHub Actions?
10) Difference between GitHub Actions and Jenkins?
11) Any branching strategies you followed in your organization?
12) How confident are you in Kubernetes and Docker?
13) How will you stop a pod in K8s?
14) How will you replicate a pod?
15) Command to get logs in K8s?
16) What will you do if a pod is not responding?
17) What will you do if a pod is getting more load and needs to stay healthy before it dies?
18) What is Docker?
19) How is Docker useful, and how will you implement it in your pipeline?
20) How will you find merge conflicts?
21) What kind of tools do you prefer for SAST and DAST security testing?
22) How will you manage the ServiceNow tasks assigned to you, and on what basis will you pick and solve them?
2nd Round:
1) How much experience do you have in writing pipeline scripts?
2) Write a pipeline script that integrates some tools, using Groovy.
3) Have you created a pipeline script end to end? What kind of tools did you use?
4) How will you create GitHub Actions?
5) Did you give any ideas to your team or project in DevOps to improve deployments? If so, what were they?
6) How would you rate yourself in Linux and Python?
7) How good are you at using AI in DevOps? (GitHub Copilot)
8) What about monitoring tools? Which ones have you used in your organization?
9) How are alerts managed in Prometheus or Grafana?
Experience #6: Posted on August 20, 2025
Company: Hexaware, Role: Data Engineer
LinkedIn Post: https://tinyurl.com/3cuhk4fk
Round 1 – Interview Questions:
• Difference between cache and persist
• Different types of secret scopes in Databricks
• How broadcast join works
• Program to print a string in reverse
• Different transformations used in Databricks
• What is a partition in Databricks
• Different file reading modes (e.g., fail fast, permissive, drop malformed)
• Common sources and destination formats used in ADB
• How to store data in different layers (Bronze, Silver, Gold)
• Difference between union and unionAll
• Where do you apply cache in the code? (with example)
• What is a partition, and how many are allocated by default?
• Different components of Databricks
• Difference between count(column) and count(*)
• Difference between distinct and dropDuplicates
• Difference between Fact and Dimension tables
• Different ways to access data from ADLS to ADB
• How to create a mount point
• How to improve performance in a slow-running Spark job
• What is autoscaling in Databricks
Experience #7: Posted on August 20, 2025
Company: Nvidia, Role: DevOps Engineer
LinkedIn Post: https://tinyurl.com/pmnp8kac
Round 1 – AI/HPC Scaling, K8s, Cloud & Linux
1. How would you auto-scale GPU nodes for training workloads without wasting GPU hours on idle pods?
2. A multi-cluster, multi-region AI training job fails halfway because one cluster runs out of GPU memory. How do you rebalance workloads live?
3. How do you configure Kubernetes taints and tolerations for GPU workloads?
4. How would you handle CUDA driver upgrades in K8s without disrupting thousands of running AI pods?
5. Explain how you’d pre-warm GPU nodes for massive AI inference traffic (e.g., ChatGPT-scale) with zero cold-start penalty.
6. How would you monitor GPU utilization in real-time in a Kubernetes cluster?
Round 2 – RCA, Fire Drills & GPU Chaos
1. How would you check if a GPU pod in Kubernetes is using the GPU assigned to it?
2. What are NCCL logs, and why are they important in distributed training?
3. Persistent storage for AI datasets starts showing 200ms+ latency. How do you pinpoint whether it’s the storage backend, the network, or the GPU node?
4. A Kubernetes GPU pod requests 16GB VRAM but only gets 12GB due to fragmentation. How do you detect and fix in real-time?
5. Your AI pipeline cost doubles in 24 hours with no infra change. Profiling shows a silent GPU resource leak. How do you hunt it down?
Round 3 – Leadership, Reliability Culture & Scaling Influence
1. How do you set up SLOs for both AI inference latency and batch training completion times without overprovisioning GPUs?
2. You’re told to implement multi-region AI inference failover without DNS-based routing. What’s your plan?
3. How do you justify infra cost for idle GPU pre-warming to leadership when each hour costs $30–$40 per GPU?
Experience #8: Posted on August 19, 2025
Company: J. P. Morgan, Role: Senior DevOps Engineer
LinkedIn Post: https://tinyurl.com/5yszcdyu
Round 1 & 2 – Technical Deep Dive
1. You’ve deployed an app to Azure Kubernetes Service (AKS) and it fails health checks randomly. How do you debug this end-to-end?
2. In a canary deployment to production, half the traffic returns 502, while others succeed. Walk us through your troubleshooting approach.
3. CI/CD pipeline takes 40 mins to deploy a small change. What would you do to optimize it?
4. You see high CPU usage in one pod, but logs look clean. What next?
5. You’re asked to design a highly available logging system for 100+ microservices across 3 regions. What tools and architecture would you suggest?
6. Production app works fine for internal users but fails for external ones (403 error). How will you isolate the issue?
7. How do you ensure secure and dynamic secret rotation in Azure DevOps pipelines?
8. Explain how you’d use Azure Application Gateway with Web Application Firewall for a sensitive banking application.
9. During an Azure deployment, you receive intermittent DNS resolution issues. What can be the causes?
10. A user reports 10-second delays every 15 minutes in an app running on AKS. No code changes happened. How would you begin RCA?
11. Jenkins jobs are randomly failing at the artifact upload step. What layers would you check?
12. How would you set up an automated rollback strategy in Kubernetes for failed deployments?
13. Design a cost-optimized cloud architecture for an internal reporting app that runs every night and stores logs for 3 years.
14. How do you handle zero-downtime database migrations in a distributed application?
15. What’s your approach to disaster recovery for stateful apps running on containers?
16. An Azure function is being throttled. How will you detect and fix it?
17. Define a plan for blue/green deployment with rollback on Azure using Terraform and pipelines.
18. How would you monitor end-to-end SLA for services involved in a payments pipeline?
19. Explain the difference in scaling strategies for compute-intensive vs I/O-intensive workloads in Azure.
20. Suppose your production pipeline is blocked due to missing approvals and stakeholders are unreachable. What will you do?
Experience #9: Posted on August 19, 2025
Company: Undisclosed, Role: Data Engineer
LinkedIn Post: https://tinyurl.com/2sme4yad
SQL/Python
1. Write a query to return the top 3 products by revenue per category using window functions (ties handled deterministically).
2. From logins(user_id, login_date), compute the Day-1 retention: users who returned the day after their first login.
3. Find the smallest and largest number in an array
4. Check if a string is a subsequence of another string
PySpark
5. From events(user_id, ts, action), assign session IDs where gaps >30 min start a new session (window + lag).
6. Deduplicate events(event_id, user_id, ts, payload) keeping the latest row per event_id (watermarking optional).
7. Compute running total sales per user ordered by date; ensure correctness with late/out-of-order records.
8. Read mixed schema Parquet (some files add age), and merge schema safely; explain partitioning and file sizing choices.
ADF / Databricks
9. Design an incremental load from on-prem SQL to ADLS → Synapse using ADF (watermark/CDC, retries, idempotency).
10. ADF pipeline intermittently fails due to API throttling – how do you harden the pipeline (IR choice, concurrency, backoff, until/retry)?
11. In Databricks, build a bronze → silver → gold Delta pipeline with SCD Type 2 in silver; outline jobs/clusters and QA checks.
12. Compare Mapping Data Flows vs. Databricks notebooks for complex joins/aggregations on 2 TB – when do you pick each?
Synapse / Fabric
13. In Synapse Dedicated SQL, design a table and distribution strategy for a 1B-row fact table; justify hash vs. round-robin vs. replicate.
14. Implement PolyBase/OPENROWSET external tables to query Parquet in ADLS; discuss stats, result-set caching, and pitfalls.
15. In Microsoft Fabric, design a lakehouse with OneLake + Delta – explain shortcuts, medallion layers, and governance with domains.
16. Migrate an existing Synapse DW to Fabric Warehouse: outline compatibility gaps, cost model changes, and performance tuning steps.
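The sessionization logic behind question 5 above (a gap of more than 30 minutes starts a new session) can be sketched in plain Python. This is only an illustration of the "previous timestamp per user" idea that a PySpark answer would express with a window and lag; the function name and the session ID format are made up for this example.

```python
from datetime import datetime, timedelta


def assign_sessions(events, gap=timedelta(minutes=30)):
    """Return (user_id, ts, action, session_id) rows.

    events is an iterable of (user_id, ts, action) tuples. A new session
    starts for a user when the gap to that user's previous event exceeds `gap`.
    """
    last_ts = {}     # user_id -> timestamp of that user's previous event
    session_no = {}  # user_id -> running session counter
    out = []
    # Sorting by (user, time) mirrors PARTITION BY user_id ORDER BY ts.
    for user_id, ts, action in sorted(events, key=lambda e: (e[0], e[1])):
        prev = last_ts.get(user_id)
        if prev is None or ts - prev > gap:
            session_no[user_id] = session_no.get(user_id, 0) + 1
        last_ts[user_id] = ts
        out.append((user_id, ts, action, f"{user_id}-{session_no[user_id]}"))
    return out


events = [
    ("u1", datetime(2025, 8, 19, 10, 0), "view"),
    ("u2", datetime(2025, 8, 19, 10, 5), "view"),
    ("u1", datetime(2025, 8, 19, 10, 20), "click"),
    ("u1", datetime(2025, 8, 19, 11, 0), "view"),  # 40 min after previous u1 event
]
for row in assign_sessions(events):
    print(row)
```

In Spark the same steps would typically be a lag over a per-user window, a flag for gaps above the threshold, and a running sum of that flag.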
Experience #10: Posted on August 15, 2025
Company: Morgan Stanley, Role: Python Data Engineer
LinkedIn Post: https://tinyurl.com/5n6h6pbp
1. Write a Python program to reverse a string without using built-in functions.
2. Given a list of integers, find the second largest element without sorting.
3. Implement a function to check if a string is a palindrome.
4. Write a Python program to count the frequency of each character in a string.
5. Given a list of numbers, remove all duplicates without using set().
6. Write a Python program to merge two sorted lists into one sorted list.
7. Implement a function to find the factorial of a number using recursion.
8. Write a Python program to find all prime numbers between 1 and 100.
9. Given a list of integers, find the pair whose sum is closest to a given target.
10. Implement a function to flatten a nested list (e.g., [1, [2, [3, 4]], 5] → [1, 2, 3, 4, 5]).
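As a reference point for two of the questions above (second largest without sorting, and flattening a nested list), here is a minimal sketch. It assumes "second largest" means the second largest distinct value, and that only list objects count as nesting; an interviewer may define either differently.

```python
def second_largest(nums):
    """Return the second largest distinct value in one pass, or None if there is none."""
    largest = second = None
    for n in nums:
        if largest is None or n > largest:
            # New maximum: the old maximum becomes the runner-up.
            largest, second = n, largest
        elif n != largest and (second is None or n > second):
            second = n
    return second


def flatten(nested):
    """Recursively flatten arbitrarily nested lists into a single flat list."""
    flat = []
    for item in nested:
        if isinstance(item, list):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


print(second_largest([4, 9, 2, 9, 7]))  # 7
print(flatten([1, [2, [3, 4]], 5]))     # [1, 2, 3, 4, 5]
```

The recursive flatten is the simplest to explain; for very deep nesting an explicit stack avoids Python's recursion limit.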
Experience #11: Posted on August 15, 2025
Company: Credit Suisse, Role: Azure Data Engineer
LinkedIn Post: https://tinyurl.com/ahsu9c7f
Preliminary Round (Online Assessment)
– SQL Coding (Joins, Window functions)
– Data Structures (arrays, strings coding problems; medium level)
1st Round (Technical Interview 1)
– Difference between Azure Synapse Dedicated SQL Pool vs Serverless SQL Pool
– ADLS Gen2 storage hierarchy, partitioning strategy, and performance considerations
– PolyBase vs COPY statement in Synapse
– Data ingestion using ADF (Copy Activity vs Dataflow)
– Delta Lake vs Parquet formats and advantages of Delta
– Spark narrow vs wide transformations, coalesce() vs repartition()
– Scheduling jobs in Azure Databricks (Job clusters, Workflows, ADF integration)
– Azure Data Lake Storage vs Blob Storage
2nd Round (Technical Interview 2 – Senior Manager)
– Design a schema for storing stock trades (orders, customers, instruments): normalized vs denormalized.
– Explain when to use Snowflake Schema vs Star Schema in a data warehouse
– What validations would you implement in a staging zone before loading curated data?
– When would you recommend Azure Stream Analytics vs Spark Structured Streaming?
– End to End Project Discussion.
Techno-Managerial Round
– Describe a time you had conflicting deadlines in two projects. How did you prioritize?
– What advantages does Delta Lake provide over plain Parquet in a financial use case?
– How do you size a Databricks cluster for heavy ETL workloads?
– Compare Synapse Dedicated SQL Pool vs Cosmos DB for transactional analytics.
Experience #12: Posted on August 14, 2025
Company: Amazon, Role: DevOps (Phone Interview Questions)
LinkedIn Post: https://tinyurl.com/2uubtkaf
1. Behavioral Questions (Amazon Leadership Principles Focus)
a. “Tell me about a time when you had to solve a challenging problem under a tight deadline.”
b. “Describe a situation where you had to work with a difficult colleague. How did you handle it?”
c. “Give an example of when you improved a process. What was the impact?”
d. “Tell me about a time you took ownership of a project. How did you ensure its success?”
2. Technical Knowledge and Tools
a. “What is CI/CD? Can you walk me through the process of setting up a simple CI/CD pipeline?”
b. “What tools have you used for continuous integration and continuous delivery?”
c. “Explain the role of Docker in DevOps. How would you deploy an application in Docker?”
d. “What is Kubernetes, and how does it help in container orchestration?”
e. “What is the difference between a monolithic and microservices-based architecture?”
f. “How do you handle version control? What Git workflows have you used?”
3. Cloud Infrastructure Questions (AWS Focus)
a. “What AWS services have you worked with? Which ones do you consider most important for a DevOps engineer?”
b. “Can you explain how you would set up a VPC (Virtual Private Cloud) in AWS?”
c. “What is EC2, and how would you use it to deploy a simple web application?”
d. “What is IAM in AWS, and how do you manage access control?”
e. “Explain the difference between AWS EC2 and Lambda.”
4. Scripting & Automation Questions
a. “What scripting languages do you use? Can you give an example of a script you wrote to automate a task?”
b. “Write a simple script that checks if a given file exists and outputs an appropriate message.”
c. “How would you automate the deployment of an application using Jenkins?”
5. Problem-Solving and Troubleshooting
a. “Imagine your team is deploying an application, but something breaks during the process. How would you troubleshoot it?”
b. “If a production environment is underperforming, how would you go about diagnosing the issue?”
c. “You’re getting a sudden spike in traffic. How do you ensure your system scales efficiently?”
6. System Design Basics
a. “How would you design a CI/CD pipeline for a microservices-based application?”
b. “What are the key factors to consider when designing a scalable, fault-tolerant cloud infrastructure?”
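For the scripting question in section 4 (check whether a given file exists and print an appropriate message), a short Python sketch could look like the following. The function name and the message wording are illustrative choices, not part of the interview; a shell one-liner with a file test would be an equally valid answer.

```python
import tempfile
from pathlib import Path


def describe_path(path_str):
    """Return a message saying whether path_str is an existing regular file."""
    path = Path(path_str)
    if path.is_file():
        return f"File exists: {path}"
    if path.exists():
        # The path is there but is a directory, socket, device, etc.
        return f"Path exists but is not a regular file: {path}"
    return f"File not found: {path}"


# Demo: one path that exists (a temporary file) and one that should not.
with tempfile.NamedTemporaryFile() as tmp:
    print(describe_path(tmp.name))
print(describe_path("/no/such/file.txt"))
```

Distinguishing "exists but is a directory" from "missing" is a small detail that tends to come up as a follow-up question.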
Experience #13: Posted on August 10, 2025
Company: Undisclosed, Role: DevOps Engineer
LinkedIn Post: https://tinyurl.com/r7xbdvdv
Topics & Questions I Faced:
Self Introduction & Real-Time Project
Git:
- Difference between merge and revert
- How to use cherry-pick
- How to resolve merge conflicts
Linux:
- How to check CPU usage
Jenkins:
- Pipeline stages (build, test, deploy)
- How credentials are stored
- Parallel execution in Jenkinsfile
Docker:
- Multi-stage Dockerfile example
Kubernetes:
- Auto-scaling setup (HPA)
- Deployment stages (Deployment, ReplicaSet, Pod)
- Which ones I used and why
CI/CD:
- Experience with GitHub Actions and AWS CodePipeline
AWS Lambda:
- Use case and deployment
Hosting:
- How to deploy a static website in AWS
Terraform:
- Storing secrets securely
- Difference between plan and apply
- What is drift?
Ansible vs Terraform:
- Why I used Ansible in my current project
AWS Networking:
- VPC connectivity
- Internet Gateway vs NAT Gateway
Experience #14: Posted on August 9, 2025
Company: Databricks, Role: Data Engineer
LinkedIn Post: https://tinyurl.com/bdecch3e
- What is predictive optimisation?
- What is liquid clustering?
- In Delta Sharing, what is CDF (Change Data Feed)?
- Can you share Delta tables with history?
- In Auto Loader, if the table gets truncated, how do you recover the data?
- Explain how to implement DLT.
- Explain the optimizations applied to your data in Databricks.
- What is the volume of data processed each day?
- What is a SQL warehouse, and how has it made it easier to connect to other tools?
- What is Unity Catalog?
- Why did your organisation shift to UC managed tables and not external tables?
Experience #15: Posted on July 19, 2025
Company: Undisclosed, Role: Data Engineer
LinkedIn Post: https://tinyurl.com/mpmzpufp
SQL
– Write a query to find the second highest salary from an employee table without using MAX in subquery.
– Explain the difference between INNER JOIN, LEFT JOIN, and FULL OUTER JOIN with examples.
– Given a sales table, find the top 3 products sold in each region using RANK or DENSE_RANK.
Python
– How do you handle a file that is too large to fit into memory in Python? Show an example.
– What is the difference between mutable and immutable data types? Give examples.
– Write Python code to merge multiple CSV files from a directory into a single Pandas DataFrame.
PySpark
– Explain the difference between narrow and wide transformations in Spark. Give examples.
– How do you perform joins in PySpark and handle skewed data?
– Given a large dataset, write PySpark code to calculate the average sales per region per month.
Azure Databricks
– Explain the difference between a cluster, job cluster, and all-purpose cluster in Databricks.
– How do you optimize Delta Lake performance in Databricks?
– How do you integrate Databricks with Azure Data Lake Storage for both reading and writing data?
Azure Data Factory
– What is the difference between mapping data flows and wrangling data flows?
– How do you implement incremental load in Azure Data Factory?
– Explain how triggers work in ADF and how to schedule pipelines.
Microsoft Fabric
– Explain the architecture of OneLake in Microsoft Fabric.
– How does Fabric differ from Azure Synapse Analytics and Power BI?
– How would you migrate existing ADF pipelines to Microsoft Fabric Data Factory?
Azure Synapse Analytics
– Explain the difference between dedicated SQL pool and serverless SQL pool in Synapse.
– How do you partition data in Synapse for better query performance?
– What are materialized views in Synapse and when should you use them?
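For the Python question in this experience about handling a file that is too large to fit into memory, one common answer is to stream it instead of loading it whole. The sketch below shows two variants, line-by-line iteration and fixed-size binary chunks; the function names and the demo file contents are invented for illustration.

```python
import os
import tempfile


def count_matching_lines(path, needle):
    """Count lines containing `needle`, holding only one line in memory at a time."""
    matches = 0
    with open(path, "r", encoding="utf-8") as handle:
        for line in handle:  # file objects iterate lazily, line by line
            if needle in line:
                matches += 1
    return matches


def read_in_chunks(path, chunk_size=1024 * 1024):
    """Yield the file as successive byte chunks of at most chunk_size bytes."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Demo on a small temporary file; the same code works unchanged on multi-GB files.
with tempfile.NamedTemporaryFile(
    "w", suffix=".log", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("ok\nerror: disk\nok\nerror: net\n")
    demo_path = tmp.name

print(count_matching_lines(demo_path, "error"))
print(sum(len(chunk) for chunk in read_in_chunks(demo_path, chunk_size=8)))
os.remove(demo_path)
```

For tabular data, the same streaming idea is usually expressed through a library's chunked reader rather than hand-written loops.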