"No linear indexing system is adequate to express the multi-dimensionality of knowledge" — Neal Stephenson, Quicksilver
Data Science Path & Life Cycle
Data Science ⇒ practice of transforming raw, challenging data into actionable insights; foreseen by Isaac Asimov in his Foundation series
Data Analytics ⇒ application of data science skills in the business world; the science of extracting trends, patterns & useful information from datasets to increase profits, better utilize resources & improve managerial operations
Planning | Discovery & Business Understanding
- Define goals
- Organize resources
- Coordinate people
- Schedule project
Wrangling | Data Acquisition, Cleaning & Exploration
- Get data
- Clean data
- Explore data
- Refine data
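The four wrangling steps above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the CSV text, the column names (region, units, price) and the helper function names are all invented here, and real projects typically pull data from files or databases rather than an inline string.

```python
import csv
import io
import statistics

# Hypothetical raw data; column names and values are invented for illustration.
RAW_CSV = """region,units,price
north,10,2.50
south,,3.00
east,7,2.75
west,12,
north,5,2.50
"""

def get_data(text):
    """Get: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_data(rows):
    """Clean: drop rows with missing fields and convert numeric strings."""
    cleaned = []
    for row in rows:
        if not row["units"] or not row["price"]:
            continue  # skip incomplete records
        cleaned.append({
            "region": row["region"].strip().lower(),
            "units": int(row["units"]),
            "price": float(row["price"]),
        })
    return cleaned

def explore_data(rows):
    """Explore: compute simple summary statistics on the units column."""
    units = [r["units"] for r in rows]
    return {
        "count": len(rows),
        "mean_units": statistics.mean(units),
        "max_units": max(units),
    }

def refine_data(rows):
    """Refine: add a derived revenue column to each row."""
    return [{**r, "revenue": r["units"] * r["price"]} for r in rows]

rows = refine_data(clean_data(get_data(RAW_CSV)))
summary = explore_data(rows)
print(summary)
```

Dropping incomplete rows is only one possible cleaning policy; imputing missing values is an equally common choice, depending on the goals set during planning.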
Modeling | Predictive Modeling, Feature Engineering & Machine Learning
- Create model
- Validate model
- Evaluate model
- Refine model
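As a rough sketch of the create, validate & evaluate steps above, the example below fits a one-variable least-squares line on a training split, checks it on held-out data and compares it against a naive baseline. Everything here is an assumption made for illustration: the synthetic data (y = 2 + 3x plus noise), the 40/10 split and the function names fit_line, predict and mse.

```python
import random
import statistics

def fit_line(xs, ys):
    """Create: fit y = intercept + slope * x by ordinary least squares."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def predict(model, xs):
    """Apply a fitted (intercept, slope) pair to new inputs."""
    intercept, slope = model
    return [intercept + slope * x for x in xs]

def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return statistics.mean((a - p) ** 2 for a, p in zip(actual, predicted))

# Synthetic data (an assumption for this sketch): y = 2 + 3x + Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(50)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

# Validate: hold out the last 10 points so the model is scored on unseen data.
train_x, test_x = xs[:40], xs[40:]
train_y, test_y = ys[:40], ys[40:]

model = fit_line(train_x, train_y)

# Evaluate: compare against a baseline that always predicts the training mean.
model_mse = mse(test_y, predict(model, test_x))
baseline_mse = mse(test_y, [statistics.mean(train_y)] * len(test_y))
print(model, model_mse, baseline_mse)
```

If the model failed to beat the baseline on held-out data, that would be the cue for the refine step: revisit the features, the data or the model form.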
Applying | Data Visualization, Reporting & Decision Making
- Present model
- Deploy model
- Revise model
- Archive assets
Data Science as Business Intelligence
Applying the Scientific Method to make decisions, rather than relying on intuition, anecdotal evidence, gut instinct, executive opinion or a coin flip
Scientific Method
- Ask a question
- Form a hypothesis
- Design an experiment
- Collect data
- Analyze the data
- Draw a conclusion
- Take action
Justification
- Evidence-based
- Objectively measurable
- Reproducible
- Transparent
Data Analytics Mathematics
Regression Analysis
An understandable equation to predict a single outcome based on multiple predictor variables
- Multiple Linear Regression
- Linear Regression with Linear Algebra
- Matrix-Vector Multiplication Function
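To make the regression topics above concrete, here is a minimal pure-Python sketch of multiple linear regression done with linear algebra. It assumes the usual least-squares formulation, in which the coefficients b solve the normal equations (XᵀX) b = Xᵀy and predictions are the matrix-vector product X b. The function names (mat_vec, transpose, mat_mul, solve, fit, predict) and the toy data are invented for this example; in practice a library such as NumPy would normally do this work.

```python
def mat_vec(matrix, vector):
    """Multiply an m x n matrix (list of rows) by a length-n vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]

def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    b_t = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_t] for row in a]

def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def add_intercept(X):
    """Prepend a column of ones so the first coefficient is the intercept."""
    return [[1.0] + list(row) for row in X]

def fit(X, y):
    """Least-squares coefficients from the normal equations."""
    design = add_intercept(X)
    d_t = transpose(design)
    return solve(mat_mul(d_t, design), mat_vec(d_t, y))

def predict(X, coef):
    """Predictions are a single matrix-vector multiplication."""
    return mat_vec(add_intercept(X), coef)

# Toy data generated exactly from y = 1 + 2*x1 + 3*x2 (invented for illustration).
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [9, 8, 19, 18, 26]

coef = fit(X, y)
preds = predict(X, coef)
print(coef)
```

Solving the normal equations directly is easy to read but can be numerically fragile when predictors are highly correlated, which is one reason libraries tend to use QR or SVD factorizations instead.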
Data Acquisition
Structured Query Language (SQL)
- Basic Anatomy & Physiology of SELECT Query
  - Operation: Describes what is to be done, using the SELECT keyword followed by the names of columns, optionally combined with functions.
  - Data: The FROM keyword followed by one or more tables connected together with reserved keywords such as JOIN, indicating what data should be scanned for filtering, selection & calculation.
  - Conditional: Filters the data to only the rows that meet a condition, usually indicated by the WHERE keyword.
  - Grouping: A special clause that takes rows of a data source, assembles them using a key defined by a GROUP BY clause, then calculates a value from all rows that share the same key.
  - Post-processing: Takes the query results & formats them by sorting & limiting, using keywords such as ORDER BY & LIMIT.
- Brilliant Tutorial Resources
  - A Curious Moon from Big Machine
    - Rockets. Moons. PostgreSQL. What else do you need?!
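The five parts of a SELECT query described above (operation, data, conditional, grouping & post-processing) can be seen together in one small query. The sketch below runs it through Python's built-in sqlite3 module against an in-memory database; SQLite is used here only as a convenient stand-in for PostgreSQL, and the orders table, its columns and its rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical orders table (invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO orders (region, units) VALUES (?, ?)",
    [("north", 10), ("south", 4), ("north", 5),
     ("east", 7), ("west", 0), ("south", 2)],
)

QUERY = """
SELECT region, SUM(units) AS total_units  -- Operation: columns & functions
FROM orders                               -- Data: table(s) to scan
WHERE units > 0                           -- Conditional: keep matching rows
GROUP BY region                           -- Grouping: one result row per key
ORDER BY total_units DESC                 -- Post-processing: sort...
LIMIT 2                                   -- ...and limit the output
"""

result = conn.execute(QUERY).fetchall()
conn.close()
print(result)  # [('north', 15), ('east', 7)]
```

The clauses are written in the order SQL requires, but conceptually the rows are scanned (FROM) and filtered (WHERE) before they are grouped, and sorting and limiting are applied last.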
About Ryan L Buchanan
I am re-skilling as a Data Analyst & Machine Learning Engineer. I am currently enrolled in a Master's in Data Analytics. I am also acquiring certifications as an ML Engineer & Algorithmic Trader from Udacity. I have an MBA & an MS in Instructional Design.
I have a multi-disciplinary background including military intelligence, psychology, linguistics, economics, virtual reality & educational technology. I have worked abroad for ten years with military, universities & vocational schools. I have working knowledge of Arabic, Chinese & French. I am very mobile, able to relocate quickly, adapt easily to diverse working conditions & have a current passport.
I have a passion for mathematics, statistics & artificial intelligence. I am enthusiastic, highly self-motivated & enjoy presenting informative data to decision makers. I am eager to work with dynamic teams to create high-quality products & services.