Sr. Data Scientist/Architect

MarkiTech

Toronto
About the Job
Looking for a Sr. Data Scientist / Architect with telecom experience.
SUMMARY:
  • An accomplished, performance-driven professional with 7+ years of technical and managerial experience in the IT sector, spanning technical consulting and the design of future-proof data architectures for Telecom and Network domains, as well as Billing, IN and VAS, DWH, CRM, CC, HRIS, Big Data tools, RDBMS, ERP, POS, and marketing data analysis.
  • Expertise in Business Intelligence, Data Warehousing, and Reporting tools in the Telecom industry
  • 4 years of experience working with Tableau Desktop and Tableau Server across multiple versions.
  • 4 years of work experience with statistical data analysis, including linear models, multivariate analysis, data mining, and machine learning techniques.
  • Hands-on experience with Python to develop analytic models and MapReduce (mapper/reducer) solutions.
  • Hands-on experience in creating insightful Tableau worksheets, dashboards to generate segment analysis and financial forecasting reports.
  • Proficient in data modeling for 360-degree customer views and customer behavior analysis.
  • Strong skill set in PL/SQL, ETL, Business Intelligence, SQL Server Integration Services (SSIS), and SQL Server Reporting Services (SSRS).
  • Proficient in Data Cleansing and Data Validation checks during staging before loading the data into the Data warehouse.
  • Highly proficient in using PL/SQL to develop complex stored procedures, triggers, indexes, tables, user-defined procedures, relational database models, and SQL joins supporting data manipulation and conversion tasks.
  • Highly skilled in creating, maintaining, and deploying Extract, Transform, Load (ETL) packages to the Integration Services server using Project Deployment and Package Deployment models.
  • Outstanding interpersonal communication, problem-solving, documentation, and business analysis skills.
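The staging-time data cleansing and validation mentioned above can be illustrated with a minimal sketch; the field names and rules here are hypothetical, not from any actual project:

```python
# Hypothetical validation pass over staged records before warehouse load.
def validate_row(row):
    """Return the list of rule violations for one staged record."""
    errors = []
    if not row.get("customer_id"):          # required key must be present and non-empty
        errors.append("missing customer_id")
    if row.get("amount") is not None and row["amount"] < 0:
        errors.append("negative amount")    # business rule: no negative amounts
    return errors

staged = [
    {"customer_id": "C1", "amount": 10.0},
    {"customer_id": "",   "amount": 5.0},   # fails: empty key
    {"customer_id": "C3", "amount": -2.0},  # fails: negative amount
]
clean   = [r for r in staged if not validate_row(r)]
rejects = [r for r in staged if validate_row(r)]
print(len(clean), len(rejects))  # only clean rows would proceed to the load step
```

In a real pipeline these checks would typically run inside the ETL tool (SSIS or Informatica, per the bullets above) rather than in ad hoc Python, with rejects routed to an error table for review.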

TECHNICAL SKILLS:
  • Data Analytics Tools/Programming: Python (NumPy, SciPy, pandas), MATLAB, Microsoft SQL Server, Oracle PL/SQL.
  • Data Visualization: Tableau, Visualization packages, Microsoft Excel.
  • Machine Learning Algorithms: Classifications, Regression, Clustering, Feature Engineering.
  • Data Modeling: Star Schema, Snowflake Schema.
  • Big Data Tools: Hadoop, MapReduce, Sqoop, Pig, Hive, NoSQL, Spark.
  • Databases: Oracle, SQL Server, Teradata.
  • ETL: Informatica, SSIS.
  • Others: Deep Learning, Text Mining, C, JavaScript, Shell Scripting, Spark MLlib, SPSS, Cognos.
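The star-schema modeling listed above can be sketched with a minimal, self-contained example using Python's built-in sqlite3 module; all table and column names are illustrative, not from a real project:

```python
import sqlite3

# Minimal star schema: one fact table keyed to surrounding dimension tables.
# Names and data are hypothetical, for illustration only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, segment TEXT);
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
CREATE TABLE fact_sales (
    sale_id INTEGER PRIMARY KEY,
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    date_key INTEGER REFERENCES dim_date(date_key),
    amount REAL
);
""")
cur.execute("INSERT INTO dim_customer VALUES (1, 'Acme', 'B2B')")
cur.execute("INSERT INTO dim_date VALUES (20240101, '2024-01-01', '2024-01')")
cur.execute("INSERT INTO fact_sales VALUES (1, 1, 20240101, 99.5)")

# Typical BI query: join the fact table to its dimensions and aggregate.
cur.execute("""
SELECT c.segment, d.month, SUM(f.amount)
FROM fact_sales f
JOIN dim_customer c ON f.customer_key = c.customer_key
JOIN dim_date d ON f.date_key = d.date_key
GROUP BY c.segment, d.month
""")
rows = cur.fetchall()
print(rows)
```

A snowflake schema differs only in that the dimensions themselves are further normalized (e.g., a separate `dim_segment` table referenced from `dim_customer`).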

Responsibilities:
  • Involved in developing analytics solutions based on the Machine Learning platform and demonstrated creative problem-solving approach and strong analytical skills.
  • Interacted with other departments to understand and identify data needs and requirements, and worked with other members of the IT organization to deliver QlikView-based data visualization and reporting solutions addressing those needs.
  • Worked with the Architecture team to get the metadata approved for the new data elements that are added for this project.
  • Acted as a data storyteller, mining data from sources such as SQL Server, Oracle, cube databases, web analytics, Business Objects, and Hadoop; provided ad hoc analysis and reports to the executive management team.
  • Created various B2B predictive and descriptive analytics using R and Tableau.
  • Exploratory analysis and model building to develop predictive insights and visualize, interpret, report findings, and develop strategic uses of data.
  • Utilized Spark, Scala, Hadoop, HBase, Kafka, Spark Streaming, MLlib, and R, along with a broad variety of machine learning methods including classification, regression, dimensionality reduction, etc.
  • Designed and provisioned the platform architecture to execute Hadoop and machine learning use cases under Cloud infrastructure.
  • Selected statistical algorithms (Two-Class Logistic Regression, Boosted Decision Tree, Decision Forest classifiers, etc.).
  • Used MLlib, Spark's machine learning library, to build and evaluate different models.
  • Involved in creating a Data Lake by extracting customers' Big Data from various data sources into Hadoop HDFS. This included data from Excel, flat files, Oracle, SQL Server, MongoDB, HBase, and Teradata, as well as log data from servers.
  • Created high-level ETL design documents and assisted ETL developers in the detailed design and development of ETL maps using Informatica.
  • Used R and SQL to create statistical models involving multivariate regression, linear regression, logistic regression, PCA, random forests, decision trees, and support vector machines for estimating the risks of welfare dependency.
  • Helped in migration and conversion of data from the Oracle database, preparing mapping documents and developing partial SQL scripts as required.
  • Generated ad-hoc SQL queries using joins, database connections, and transformation rules to fetch data from legacy Oracle and SQL Server database systems.
  • Worked on predictive and what-if analysis using R on data from HDFS; successfully loaded files into HDFS and from HDFS into Hive.
  • Analyzed data and predicted end-customer behavior and product performance by applying machine learning algorithms using Spark MLlib.
  • Performed data mining using complex SQL queries to discover patterns, and used extensive SQL data profiling/analysis to guide the design of the data model.
  • Created numerous dashboards in Tableau Desktop based on data collected from Zonal and Compass, blending data from MS Excel and CSV files with MS SQL Server databases.
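The build-and-evaluate model loop described above (done there with Spark MLlib) can be sketched framework-free in plain Python; the toy dataset, thresholds, and model names below are purely illustrative, not the original work:

```python
# Toy train/evaluate loop, analogous in shape to comparing MLlib models.
# Data and models are illustrative only.
train = [(1.0, 0), (2.0, 0), (3.0, 0), (4.0, 1)]   # (feature, label)
test  = [(1.5, 0), (3.5, 1), (2.5, 1), (0.5, 0)]

def majority_model(train):
    """Baseline: always predict the most common training label."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def threshold_model(train):
    """Predict 1 when the feature exceeds the midpoint of the class means."""
    mean0 = sum(x for x, y in train if y == 0) / sum(1 for _, y in train if y == 0)
    mean1 = sum(x for x, y in train if y == 1) / sum(1 for _, y in train if y == 1)
    cut = (mean0 + mean1) / 2
    return lambda x: 1 if x > cut else 0

def accuracy(model, data):
    """Fraction of held-out examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

# Fit each candidate on the training split, score on the held-out split.
scores = {name: accuracy(fit(train), test)
          for name, fit in [("majority", majority_model),
                            ("threshold", threshold_model)]}
print(scores)
```

In the actual MLlib setting the same pattern applies: each candidate estimator is fit on a training DataFrame and scored with an evaluator on a held-out split, and the best-scoring model is kept.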

Environment:
 Python, MongoDB, JavaScript, SQL Server, HDFS, Pig, Hive, Oracle, DB2, Tableau Desktop, ETL (Informatica), SQL, T-SQL, Hadoop Framework, Spark SQL, Spark MLlib, NLP, MATLAB, HBase, R, PySpark, Excel, Linux, Informatica MDM.

Confidential, Toronto, CA