Allan Joshua

199 14th St NE, Atlanta, GA 30309


Employment - 14+ years of Experience

Equifax (Aug 2016 - present)

Sr. Big Data Engineer (Machine Learning), Data and Analytics Innovation Team

  • Engage with select customers to create proofs of concept and pilot projects for big data insights and predictive analytics using advanced mathematical modeling.
  • Develop big data algorithms in Scala (Spark) and Java.
  • Work with key stakeholders and understand their needs to develop new or improve existing solutions around data and analytics.
  • Partner with other data scientists on the team to develop scalable solutions using large datasets.
  • Research innovative data solutions to solve real market problems.
  • Conceptualize, analyze and develop actionable recommendations for strategic challenges facing the organization.
  • Mine big data and other unstructured data to tap unused data sources and deliver insight into new and emerging solutions.
  • Work with cross-functional teams to develop ideas and execute business plans.


Seaview Research (Sep 2014 - Jul 2016)

Software Development Engineer

  • To reduce the cost of IT operations, built software that automated the QA process and cut trial-build timelines from 8 weeks (the industry standard for building a trial) to 1 week, considerably reducing expenses.
  • Suggested changes to help improve operational procedures for trial build.
  • Wrote code to export clinical data into a specific format to support a new revenue stream.


Tata Consultancy Services - Retail Practice - Machine Learning - Recommendation Engine - an entire suite (Dec 2013 - Aug 2014)

Technical Lead

  • Tata Consultancy Services is a consulting giant with a large clientele across different domains. Within the retail domain, its clients include Best Buy, Walmart, The Home Depot, and JCPenney.
  • In response to an RFP from JCPenney for big data analytics, designed and built a product recommendation engine covering several possible scenarios, packaged as a product for TCS.
  • Used several techniques, including classification, regression, and association rules.
  • Wrote Java code using Apache Spark's API.
  • Used JPMML to implement web services that respond to real-time requests.


Tata Consultancy Services - Machine Learning - Association Rules, Product recommendation - POC (Dec 2013 - Aug 2014)

Technical Lead

  • Built a product recommendation solution using association rules.
  • POS data (items from each bill) was extracted and imported into HDFS using Sqoop.
  • Wrote Java code using Apache Spark's API to mine rules based on past purchases.
  • Also used Mahout's FPG algorithm for association rules.
  • The final result was the sets of items most likely to be purchased together (pairs, triplets, etc.).
  • The business team would use this information to decide strategically on promotions.


Tata Consultancy Services - Machine Learning - Classification of Customers into categories based on spending patterns - POC (May 2013 - Dec 2013)

Technical Lead

  • Built a classification model to classify The Home Depot's customers as PRO customers or CONSUMERS.
  • Modeling involved identifying attribute patterns that set a PRO customer apart from a CONSUMER.
  • Once built, the model was run on a separate dataset of not-yet-classified customers to predict whether each was a PRO or a CONSUMER.
  • The trained model was exported as a PMML file so it could be evaluated on different datasets regardless of the language (Java, C, etc.) or framework (Hadoop, Cascading, etc.) used for scoring.
  • Used Sqoop to load data from Teradata to HDFS.
  • Used Apache Spark's MLlib to implement a classification algorithm.
  • Used JPMML to build a RESTful web service to evaluate the model.
  • Used the Cascading framework to evaluate the model at scale.


Tata Consultancy Services - The Home Depot - Machine Learning - Product Ranking in Product list page (Sep 2012 - May 2013)

Technical Lead

  • Designed and developed a MySQL database to host data from Omniture, WCS and Endeca.
  • Designed and developed a Java program to read from the database and build the features for optimization.
  • Designed and developed a user interface in HTML5 for modifying the objective function.
  • Designed and developed an optimization algorithm to minimize the objective function.
  • Used the NDCG metric in the optimization algorithm to learn weights from the previous week's data and predict a good ordering of the product lists on the website for the next day. The list could be tuned to target seasonal variations, maximize margin, maximize conversion, etc., based on the business-defined optimization objective.


Tata Consultancy Services - The Home Depot - Penske Truck Rental (Sep 2010 - Sep 2012)

Technical Lead

  • Designed and developed an application, delivered across multiple releases, for renting Penske trucks at Home Depot locations.
  • This effort involved two full cycles of the SDLC and interaction among several matrix teams within The Home Depot and outside it (Penske). The three efforts above (including this one) included preparing the System Architecture Design, Logical System Design, Physical System Design, Network Diagrams, and Low-level System Design Specification; packaging code using openMake into native packages for Linux; writing implementation plans; and creating RFCs for deploying software policies to production servers.


Tata Consultancy Services - The Home Depot - ProRewards - A Customer Loyalty Program (Nov 2009 - Sep 2010)

Technical Lead

  • Designed and developed an application, delivered across multiple releases, that rewards customers based on their accrued spending.
  • Uncovered a security vulnerability in Flex applications at The Home Depot and developed a solution for the entire company to use.


Tata Consultancy Services - The Home Depot - Asset Protection - A Management Survey Tool (Mar 2009 - Nov 2009)

Technical Lead

  • Designed and developed an application for BlackBerry devices to help store managers perform audits.


Tata Consultancy Services - Citibank - Online Solutions for Credit Cards (Nov 2007 - Mar 2009)

Software Developer

  • Developed an online solution for Citibank credit cards covering collections, credit card statements, etc.


Polaris Software Labs - SmartBatch Core Framework (Sep 2005 - Sep 2007)

Software Developer

  • Developed a batch framework for parallel computing.



Education

Stanford University, CA, U.S.A.

Graduate Student, Artificial Intelligence Graduate Certificate (2010 - 2012)

Anna University, Chennai, India

Bachelor's in Computer Science and Engineering


Skills, Languages and Technologies

  • Technical proficiency and practical experience in software development, gained across diverse projects in engineering coursework and professional work.
  • Supervised Learning - Discriminative and Generative Models, Generalized Linear Models, Logistic Regression, Linear Regression, Support Vector Machines, Neural Networks
  • Unsupervised Learning - K-means Clustering, Mean Shift, EM for Mixture of Gaussians, Factor Analysis, PCA, ICA
  • Experience with the OpenCV library, MATLAB, Octave
  • Software development in Java, J2EE, RESTful web services
  • Database Design, SQL
  • Servers - Tomcat, Apache