Kagan is a software architect with expertise in functional programming, applying AI to scalable distributed systems, and big data analytics. He has experience in various industries and both small companies and large corporations.
Some of his core technologies include Spark, Redshift, Cassandra, Neo4J, Microservices, and Kafka.
Hands-on lead engineer in big data processing, performance engineering, analysis, and machine learning. Processing measurements of 500 million unique viewers watching 150 billion streams per year, with 1.5 trillion real-time transactions across more than 180 countries.
Model the identity of people, their movements across households and places, and their device ownership, for effective targeting via advertising product offerings.
Lead performance engineering efforts resulting in a 10x+ performance improvement in the connected-components graph implementation used during the aggregation step of household identity computation.
Develop a POC for an enterprise-wide monitoring infrastructure using best-practice Akka 2.6 patterns with cluster sharding.
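For readers unfamiliar with the connected-components step mentioned above, the following is a minimal single-machine sketch of the general idea, not the production implementation: identifiers (devices, people) that are linked by an observation are merged into one group using a union-find structure. The function name, the edge format, and the sample identifiers are assumptions made for illustration only.

```python
from collections import defaultdict


def connected_components(edges):
    """Group nodes into connected components with a simple union-find.

    `edges` is an iterable of (node_a, node_b) pairs; hypothetical format.
    """
    parent = {}

    def find(x):
        # Register unseen nodes as their own root.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        # Path compression: point every node on the path at the root.
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    for a, b in edges:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # union (no rank heuristic, for brevity)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Illustrative data: two devices linked to one person, one device to another.
households = connected_components(
    [("device1", "person1"), ("device2", "person1"), ("device3", "person2")]
)
```

At the data volumes described, a distributed formulation would be needed; this sketch only shows the grouping logic.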
Key contributor to DotData’s core team, building the automatic feature generation platform, which analyzes customer data, automatically extracts thousands of features, and feeds them to the model factory, where the best models leveraging the best features are presented to customers automatically, without writing any custom code.
Developed various FeatureGen operators, such as TemporalChange, TimeDifference, TargetEncoding, and Histogram.
Developed architectural roadmap for increasing the performance and scalability of the Feature Factory module.
Established extensibility foundation of the platform to decouple modules and help speed up new feature development.
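As an illustration of what a target-encoding feature operator of the kind listed above typically computes, here is a minimal sketch under stated assumptions. It is not the platform's TargetEncoding operator; the function name, the `smoothing` parameter, and the additive-smoothing formula are assumptions chosen as one common variant of the technique.

```python
from collections import defaultdict


def target_encode(categories, targets, smoothing=1.0):
    """Map each category to a smoothed mean of the target variable.

    Each category mean is pulled toward the global mean, with `smoothing`
    acting as a pseudo-count (hypothetical parameter for this sketch).
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    return {
        category: (sums[category] + smoothing * global_mean)
        / (counts[category] + smoothing)
        for category in counts
    }


# Illustrative data only.
encoding = target_encode(["a", "a", "b"], [1, 0, 1], smoothing=0.0)
```

With `smoothing=0.0` the result is the plain per-category mean; larger values make rare categories lean more on the global mean.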
Lead architect for Tibco’s flagship Data Science platform, which is unique in its ability to deploy machine learning models across various execution environments, such as relational systems, R, Spark, Cloudera, etc.
Reverse engineer existing code developed over a decade and define the architectural roadmap to upgrade it to work with Spark 2 with Akka messaging, as well as prepare it to run in the cloud. Develop Docker containers to simplify development and testing environments for the platform.
Responsible for the architecture of the data platform, covering data acquisition, processing, and publishing to consumers and analytics apps. Porting the data platform to AWS. Provide technical leadership to teams collaborating during the AWS data lab summit.
Prototyped a patient de-duplication POC using the locality-sensitive hashing algorithm from Spark ML, which demonstrated a 25x performance improvement over the existing implementation on one-tenth of the hardware.
Built a policy-driven Identity and Access Management infrastructure, a strategic initiative for Change Healthcare to secure its data, metadata, jobs, and processes in its transformation to a multi-tenant self-service platform, leveraging AWS IAM. Microservices-based platform built on Akka to support 10B authorizations/day.
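To illustrate why locality-sensitive hashing helps with the de-duplication described above, here is a small pure-Python MinHash sketch with banding. It does not use the Spark ML API and is not the POC's code; the record format, the function names, and the parameters `num_hashes` and `bands` are assumptions for illustration. The idea is that similar records are likely to share at least one bucket, so only records within a bucket need to be compared rather than all pairs.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=3):
    """Return the set of lowercase character k-grams of `text`."""
    text = text.lower()
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def _hash(seed, token):
    # Deterministic 64-bit hash, salted by `seed` to emulate a hash family.
    digest = hashlib.md5(f"{seed}:{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(tokens, num_hashes=32):
    """One minimum per salted hash function; similar sets share many minima."""
    return [min(_hash(seed, t) for t in tokens) for seed in range(num_hashes)]


def candidate_pairs(records, num_hashes=32, bands=16):
    """Return record-id pairs that share at least one signature band."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for record_id, text in records.items():
        signature = minhash_signature(shingles(text), num_hashes)
        for band in range(bands):
            key = (band, tuple(signature[band * rows:(band + 1) * rows]))
            buckets[key].add(record_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Made-up records: 1 and 2 are exact duplicates, 3 is unrelated.
records = {1: "john smith 1980-01-01", 2: "john smith 1980-01-01", 3: "xyzxyzxyz"}
pairs = candidate_pairs(records)
```

Candidate pairs would still need an exact similarity check afterwards; the banding step only narrows down which pairs are worth comparing.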
Built the technology infrastructure of Oolong to automatically pre-qualify leads, match buyers and sellers using sophisticated machine learning algorithms, and reach decision makers with marketing campaigns. The data model contains a graph of all manufacturing assets around the world, contact information of the decision makers, and hierarchies of an exhaustive list of models and parts, and how they map to manufacturing processes.
Developed an ETL pipeline for data transformation libraries for Spark and Salesforce, which were later contributed to open source. They enabled Spark developers to seamlessly interoperate with very large (tens of millions of rows) Salesforce objects as though they were regular Spark datasets, using the Salesforce Bulk API.
Provided technical leadership to a team of 80+ engineers responsible for a number of modules, including the product recommendation engine, wish list, registry, product reviews, social media integration, and promotions, as well as some of the most visited pages, e.g. the home page and the product details page. Hands-on prototyping of new functionality and modules such as the recommendation engine.
Partnering with business stakeholders and executives in defining the technology roadmap. Responsibilities include build vs. buy decisions, vendor evaluations, providing technical input to contract negotiations, hardware capacity planning, performance and scalability, data governance, etc. Highlights:
· Product Recommendation Engine driving ~10% of the sales – Scala, Spark, PredictionIO, Kafka, Flume
· Roadmap for Centralized Data Architecture that spans organizational boundaries – Hive, HBase, Tableau, Scala, Vertica, Hadoop
· WishList and Wedding Registry applications replacing legacy implementations – Cassandra, Java
· Migration to Google Maps, requiring simultaneous coordination of tens of teams across the enterprise
· Architecture of highest priority omni-channel portfolio projects, such as Beauty Subscription box, Smart Sampling, etc.
Delivered turn-key enterprise applications, as well as infrastructure software, leveraging best-of-breed open-source technologies. Combination of expertise spanning Data Architecture, SOA, and MDA.
Projects for a number of large companies, including Cisco Systems, Google People Analytics, and eBay.