Bogdan is a big data engineer with 13+ years of experience (7+ in big data) using various big data frameworks. He has worked on large-scale, compute-intensive, distributed projects for companies across the world, from early-stage start-ups to medium and large companies. Projects have been deployed either on-premise or in the AWS/GCP clouds. In addition, Bogdan has written big data code in Java, Scala, and Python.
Big data projects deployed in the cloud or on-premise (Cloudera distribution) for clients in automotive and telecommunications, including a project where Bogdan was the platform technical lead for a leading automotive company. The platform processes and stores the company's industrial data from factories, test vehicles, and packaging, and is the main pillar of the company's Industry 4.0 strategy.
Developed both batch and streaming jobs on Apache Spark/Kafka/Kafka Streams/Apache Beam that process large volumes of data and are used, in real time or offline, by companies to improve their business and services.
Migrated existing DataProc (Spark, Scala) jobs to DataFlow (Apache Beam, Java)
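The core of such a migration is isolating the per-record logic so it can move from a Spark transformation to a Beam transform unchanged. A minimal sketch, with illustrative class, field, and record names that are not from the original project:

```java
// Illustrative sketch only: job, class, and field names are assumptions.
// Keeping per-record logic in a pure static method lets the same code run
// under Spark on Dataproc (via map) and Beam on Dataflow (via a DoFn).
public class SensorRecordTransform {

    // Pure per-record logic, identical in both engines:
    // "vehicleId, metric, value" -> "vehicleId|METRIC|value"
    public static String normalize(String csvLine) {
        String[] parts = csvLine.split(",");
        return parts[0].trim() + "|"
             + parts[1].trim().toUpperCase() + "|"
             + parts[2].trim();
    }

    // Spark (Scala, Dataproc):  rdd.map(SensorRecordTransform.normalize)
    // Beam (Java, Dataflow):    input.apply(ParDo.of(new DoFn<String, String>() {
    //   @ProcessElement
    //   public void process(ProcessContext c) { c.output(normalize(c.element())); }
    // }));
    public static void main(String[] args) {
        System.out.println(normalize("v42, speed, 88.5")); // prints "v42|SPEED|88.5"
    }
}
```

With the record logic factored out this way, only the pipeline scaffolding (reads, writes, windowing) needs to be rewritten against the Beam SDK.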
Developed an API for exposing data processed by the data pipelines, built using Spring, Spring Boot, GKE, and Apigee
Tools/frameworks used, mainly from GCP or an on-premise Kafka/Hadoop stack: Java, Scala, Python, Maven, Apache Spark, Apache Kafka, Kafka Streams, KSQL, Hadoop, YARN, Delta Lake, Hive/Impala, DataProc, DataFlow, Apache Beam, Kubernetes, Pub/Sub, BigQuery, Cloud Composer/Airflow, Memorystore/Redis, Compute Engine, Spring, Spring Boot, GCP Secrets, etc.
Worked as a contractor on multiple projects for different clients, including a large retargeting company, where the work involved a distributed digital marketing platform capable of processing billions of online events and sending a few million emails per day with personalized product recommendations, as well as a self-service onboarding and setup platform for clients
Created design & architecture documents and presented and defended them before the approval board
Worked on a Spark-based solution that processes tracking events using Spark Structured Streaming and stores them in Hadoop/Druid for reporting and alerting
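Streaming jobs like this typically aggregate events into fixed time windows before writing to the reporting store. The bucketing arithmetic behind a tumbling window (what Spark Structured Streaming's `window()` does for event time) can be sketched as a pure function; the names and window size below are illustrative, not from the original project:

```java
// Illustrative sketch: assigns an event to its tumbling window, mirroring
// what Spark Structured Streaming's window() function does for event time.
public class EventWindowing {

    // Floor an event timestamp (epoch millis) to the start of its window.
    public static long windowStart(long eventTimeMillis, long windowSizeMillis) {
        return eventTimeMillis - (eventTimeMillis % windowSizeMillis);
    }

    public static void main(String[] args) {
        // Example: 5-minute (300,000 ms) tumbling windows.
        long fiveMinutes = 300_000L;
        System.out.println(windowStart(1_000_000_123L, fiveMinutes)); // prints 999900000
    }
}
```

Every event whose timestamp floors to the same window start lands in the same aggregate row, which is what makes per-window counts in Druid reproducible for reporting and alerting.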
Performed a data migration for privacy requirements: migrated several billion Cassandra rows to a solution that expires data automatically (TTL)
Tools/frameworks used: Java, Scala, Python, Spring, EMR, Amazon Athena, Redshift, AWS Glue, Spark, Storm, Hive, Airflow, Spring Boot, DataFlow/DataProc, BigQuery, Cassandra, Couchbase, Kafka, Docker, Angular, Grafana, Graphite, Kibana, Marathon, Mesos, Maven, Git, IntelliJ IDEA, Jenkins
Developed, from scratch, a big data solution that replaces an existing bank system, by reverse-engineering the existing code
Worked on a data transformation application that runs on an Apache Spark cluster
Created UAT reports, deployment scripts, and a deployment strategy for the QA/UAT/PROD environments using the UBS deploy tool, as well as a release strategy using Maven Release and Maven Assembly.
Performed code reviews
Tools/frameworks used: Java 1.7, Scala 2.11, Apache Spark 1.4.3, Spring, JUnit, Mockito, Bash scripts, Maven, Git, IntelliJ IDEA, TeamCity, RedHat 6.6, Cygwin
Worked on a financial application, used by IBM's internal financial and solution consultants, that supports non-maintenance engagements in a majority of countries around the world. The purpose of the application was to estimate the Revenue, Gross Profit, and Pre-Tax Income of an engagement based on the cost and price of labor, hardware, software, and other elements, as well as on financial and other miscellaneous factors