Senior Python Developer with Big Data experience (PySpark)
Welcome Bonus: $4,000!
The Enterprise Analytics team at the customer’s company has an open position for a Senior Python developer with Big Data experience (PySpark). The team builds platforms that provide insights to internal and external clients of the customer’s businesses in auto property damage and repair, medical claims, and telematics data. The customer’s solutions include analytical applications for claim processing, workflow productivity, financial performance, client and consumer satisfaction, and industry benchmarks.
Data engineers use big data technology to create best-in-industry analytics capability. This position is an opportunity to use Hadoop and Spark ecosystem tools and technology for micro-batch and streaming analytics. Data work includes ingestion, standardization, metadata management, business rule curation, data enhancement, and statistical computation against data sources that include relational, XML, JSON, streaming, REST API, and unstructured data. The role is responsible for understanding, preparing, processing, and analyzing data to drive operational, analytical, and strategic business decisions.
The Data Engineer will work closely with product owners, information engineers, data scientists, data modelers, infrastructure support and data governance positions. We look for engineers who start with 2-3 years of experience in the big data arena but who also love to learn new tools and techniques in a big data landscape that is endlessly changing.
About Exadel - Who We Are:
Since 1998, Exadel has been engineering its own software products and custom software for clients of all sizes. Headquartered in Walnut Creek, California, Exadel currently has 2000+ employees in development centers across America, Europe, and Asia. Our people drive Exadel’s success, and they are at the core of our values, so Exadel is a company with a people-first culture.
About Our Customer:
The customer is an American company based in Chicago. It accelerates digital transformation for the insurance and automotive industries with AI, IoT, and workflow solutions.
About Our Project:
The customer has been working on next-generation analytics since 2018 and has migrated to Amazon EMR for cloud big data platform services. The customer provides software products and services to insurance companies, repair shops, OEMs, parts suppliers, and others, and has a variety of products in the auto physical damage, casualty, telematics, and parts domains. All of these applications share data with the analytics team to build an enterprise data lake, which also allows the customer to do next-generation analytics on the amassed data.
Hortonworks is the current vendor and will be replaced by Amazon EMR. Tableau is going to be the BI vendor. MicroStrategy is currently in use and will be phased out in early 2023.
All data is sent to the data lake, and the customer can do industry reporting. These data are used by a data science team to build new products and an AI model.
We will be moving to real-time streaming using Kafka and S3. We are running a proof of concept (POC) with Dremio and Presto as the query engine.
We're migrating to version 2.0 using Amazon EMR and S3, and the query engine work falls under the 2.0 project.
- Cross-product analytics
- Analytics for every new product the customer has. The analytics team’s products are how the customer demonstrates its products’ value to clients
- Quarterly Business Review meetings use data to explain how the customer’s products are helping clients in their business
- You'll get to work with a cross-functional team
- You will learn the customer’s business
Project Tech Stack:
Technologies used are all open source: Hadoop, Hive, PySpark, Airflow, and Kafka, to name a few
Requirements:
- Proficiency in Python and PySpark
- 5+ years’ experience building, maintaining, and supporting complex data flows with structured and unstructured data
- Experience working with distributed applications
- Hands-on experience with HDFS, Hive, or Sqoop
- Ability to use SQL for data profiling and data validation
- Master’s or Bachelor’s degree
Nice to have:
- Understanding of AWS ecosystem and services such as EMR and S3
- Familiarity with Apache Kafka and Apache Airflow
- Experience with Unix commands and scripting
- Experience with and understanding of Continuous Integration and Continuous Delivery (CI/CD)
- Understanding of performance tuning in distributed computing environments (such as a Hadoop cluster or EMR)
- Familiarity with BI tools (such as Tableau or MicroStrategy)
We expect the candidate to have the necessary experience with, or understanding of, these or equivalent tools. The technologies listed above make up the project’s ecosystem.
Responsibilities:
- Build end-to-end data flows from sources to fully curated and enhanced data sets. This can include the effort to locate and analyze source data, create data flows to extract, profile, and store ingested data, define and build data cleansing and imputation, map to a common data model, transform to satisfy business rules and statistical computations, and validate data content
- Modify, maintain, and support existing data pipelines to provide business continuity and fulfill product enhancement requests
- Provide technical expertise to diagnose errors from production support teams
- Coordinate with on-site teams and work seamlessly with the US team
- Develop and maintain exceptional SQL code bases and expand our capability through Python scripting
Advantages of Working with Exadel:
- You can build your expertise with our Sales Support team, who provide assistance with existing and potential projects
- You can join any Exadel Community or create your own to communicate with like-minded colleagues
- You can participate in continuing education as a mentor or speaker
- You can take part in internal and external meetups as a speaker or listener. We support you in broadening your horizons and encourage knowledge sharing for all of our employees
- You can learn English with the support of native speakers
- You can take part in cultural, sporting, charity, and entertainment events
- Working at Exadel means always upgrading your skills and proficiency, so we provide plenty of opportunities for professional development. If you’re looking for a challenge that will lead you to the next level of your career, you’ve found the right place
- We work hard to ensure honest and open relations between employees and leadership, so our offices are friendly environments