Data Scientist, Machine Learning Practitioner
I am a passionate learner who enjoys developing new skills and leveraging them to solve challenging
problems. To me, nothing is more exciting than struggling through a problem and finding that
breakthrough solution. While studying Mechatronics Engineering at the University of Waterloo, I
have been presented with countless opportunities to do so. Collaborations with classmates have
introduced me to several interesting areas, including Machine Learning.
Outside of the classroom, I use my time to develop as a Machine Learning practitioner. By
watching video lectures, reading books and articles, and completing several side projects, I have
built a solid knowledge base in the area. During internships, I apply this skill set to tackle
complex and interesting problems.
In my free time, I enjoy weightlifting, running, golf, and soccer.
May 2022 - August 2022 · Toronto, ON · Employer Evaluation: OUTSTANDING
January 2022 - April 2022 · Toronto, ON · Employer Evaluation: OUTSTANDING
May 2021 - August 2021 · Toronto, ON (Remote) · Employer Evaluation: EXCELLENT
September 2020 - December 2020 · Waterloo, ON (Remote) · Employer Evaluation: EXCELLENT
January 2020 - April 2020 · Ottawa, ON · Employer Evaluation: EXCELLENT
April 2019 - August 2019 · Windsor, ON · Employer Evaluation: EXCELLENT
The goal is to build a Python package providing the necessary functionality for
training Neural Network models from scratch. No TensorFlow or PyTorch. Just
old-fashioned Python and its numerical computing library, NumPy.
To demonstrate a fundamental understanding of Neural Networks and their complex
intricacies, I built custom implementations of many common techniques and
structures: trainable parameters, model layers, optimizers, loss functions,
and other supporting infrastructure. Using this package, one can
build Neural Networks and achieve convergence on a variety of Machine Learning
problems.
More work will be done to improve convergence during training and to support
other layers, optimizers, loss functions, etc.
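As a rough illustration of the kind of building block involved, here is a minimal fully-connected layer in NumPy; the class and method names are my own sketch, not the package's actual API:

```python
import numpy as np

class Dense:
    """Minimal fully-connected layer (hypothetical API, for illustration only)."""

    def __init__(self, in_features, out_features, lr=0.01):
        # Trainable parameters: small random weights and zero biases.
        self.W = np.random.randn(in_features, out_features) * 0.01
        self.b = np.zeros(out_features)
        self.lr = lr

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return x @ self.W + self.b

    def backward(self, grad_out):
        # Gradients of the loss with respect to parameters and input.
        grad_W = self.x.T @ grad_out
        grad_b = grad_out.sum(axis=0)
        grad_in = grad_out @ self.W.T
        # Plain gradient descent; a real package keeps optimizers separate.
        self.W -= self.lr * grad_W
        self.b -= self.lr * grad_b
        return grad_in
```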
The goal of this project is to build a Mini Cart, powered by a Raspberry Pi,
that can take directions without a physical controller. A camera is mounted
to the front of the cart chassis. Pictures are taken and processed through a
Machine Learning algorithm to extract instructions.
Instructions include 'go left', 'go right', 'go forwards', and 'stay'. Once a proof
of concept is complete, speed controls will be added. The user simply has to
point in the direction of travel and the Mini Cart responds appropriately.
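A rough sketch of what the main control loop might look like, assuming a camera object, a trained gesture classifier, and a motor-driver interface (all names here are hypothetical):

```python
import time

import numpy as np

INSTRUCTIONS = ['go left', 'go right', 'go forwards', 'stay']

def control_loop(camera, model, motors):
    """Hypothetical loop: capture a frame, classify the gesture, drive the cart."""
    while True:
        frame = camera.capture()                   # assumed camera interface
        probs = model.predict(frame[np.newaxis])   # probabilities over 4 instructions
        instruction = INSTRUCTIONS[int(np.argmax(probs))]
        motors.execute(instruction)                # assumed motor-driver interface
        time.sleep(0.1)                            # simple fixed control rate
```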
More information on this project will be posted in the
Featured Work section of this page. To view the
source code, a GitHub link is provided.
While working at the NRC, I was asked to develop a method by which end users could
access final products: ice-presence forecasts needed to be generated daily and
made available for download. I designed and delivered the project's complete
deployment pipeline.
Using Docker, I containerized the application and pushed
the image to AWS. I then leveraged the flexibility of AWS Fargate:
scheduled tasks run daily, and products are automatically uploaded to Amazon S3. I
also created a website to display the products and validation graphs for our models.
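The daily upload step at the end of each scheduled task might look roughly like this; the bucket name and paths are placeholders, and boto3 is the standard AWS SDK for Python:

```python
import datetime

import boto3  # standard AWS SDK for Python

def upload_daily_products(local_paths, bucket="example-ice-forecasts"):
    """Push the day's forecast products to S3 (illustrative sketch)."""
    s3 = boto3.client("s3")
    date_prefix = datetime.date.today().isoformat()
    for path in local_paths:
        key = f"{date_prefix}/{path.split('/')[-1]}"
        s3.upload_file(path, bucket, key)  # upload_file(Filename, Bucket, Key)
```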
Below you can access the website I created to view and download final products.
The goal of this project is to stand the pole upright for as long as possible: a
traditional controls problem, but with the twist of a Reinforcement Learning
approach. Through the use of OpenAI Gym's environment, I am given control of the cart.
The cart can move either left or right. My task is to determine, given a 4D vector
[Position, Velocity, Angle, Angular Velocity], the action that will maximize the
probability that the pole does not fall over, in both the short and long term.
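For reference, a bare interaction with the CartPole environment under the classic Gym API looks like this, with random actions standing in for a trained policy:

```python
import gym

env = gym.make("CartPole-v1")
obs = env.reset()  # 4D state: [position, velocity, angle, angular velocity]
done = False
while not done:
    action = env.action_space.sample()          # 0 = push left, 1 = push right
    obs, reward, done, info = env.step(action)  # classic (pre-gymnasium) API
env.close()
```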
Below you can access my GitHub repo and the documentation for OpenAI Gym.
This was a first-year, first-term design project for all Mechatronics Engineering
students at the University of Waterloo. It was an open-ended project: we were given
a Lego set and an EV3 controller. We chose to monitor and defend personal
space.
The final product surveyed an area and fired projectiles at approaching
objects. The chassis was built from Lego; the source code was written in RobotC. During
the closing phase of the project, we produced a final report summarizing the project's
scope, constraints and criteria, mechanical and software design, and methods of testing.
Reinforcement Learning was used to solve this problem. The concept is analogous to positive and negative reinforcement in real life:
if a dog behaves, it receives a treat; if it misbehaves, it goes in its cage. Over time, the dog figures out what counts as good or bad behaviour based on the consequences.
This approach can be used to our advantage when training AI. The reinforcement learning setup consists of four entities: the agent, the environment, the actions the agent can take, and the rewards it receives.
The approach for solving this problem is the following: train the agent to take actions that maximize its reward.
For a given state, it is the network's job to output a vector of probabilities for taking each action in the action space. When training, we attempt to minimize the difference between that output and the action that maximizes rewards.
For example, if the CartPole is currently travelling to the left, it is preferable to reverse that movement and begin travelling to the right. Thus, a desired output from the model is [0, 1] ([Going left, Going right]). A network without training won't necessarily generate that output. The optimizer's job is to adjust the network so that the vector approaches [0, 1] when the CartPole is moving left.
We multiply our gradients by the rewards. When performing backpropagation, positive rewards (the agent did something right) cause the optimizer to descend along the gradient; the opposite is true for negative rewards (the agent did something wrong). This process has the effect of 'learning' the appropriate actions to take, given a state from the environment.
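A minimal sketch of that reward-weighted update using tf.GradientTape; the exact training code differs, but this illustrates scaling gradients by the reward:

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam()
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()

def train_step(model, state, target_action, reward):
    """One reward-weighted policy-gradient update (illustrative sketch)."""
    with tf.GradientTape() as tape:
        probs = model(state[tf.newaxis])        # action probabilities for one state
        loss = loss_fn([target_action], probs)  # distance from the desired action
    grads = tape.gradient(loss, model.trainable_variables)
    # Scale by the (discounted) reward: positive rewards reinforce the action,
    # negative rewards push the policy away from it.
    grads = [g * reward for g in grads]
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
```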
There is a fundamental problem with how rewards are calculated: how does the agent know if falling at t = 100 was caused by an action at t = 98 or t = 17? This is known as the 'Credit Assignment Problem'.
To solve this problem, we apply a discount rate to our rewards. For each time step:
discounted_reward[t] = reward[t] * discount_rate**0 + reward[t+1] * discount_rate**1 + ... + reward[t+n] * discount_rate**n
Note: discount_rate belongs to (0, 1)
What does this accomplish? It decreases the impact that future rewards have on the current time step. Using discounted rewards, the agent is able, over many games, to 'learn' which actions are beneficial given the corresponding input from the environment.
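These discounted rewards are commonly computed with a reversed running sum over the episode; a small sketch (the discount rate value is an assumption):

```python
import numpy as np

def discount_rewards(rewards, discount_rate=0.95):
    """Each step's value is its reward plus the discounted sum of all
    future rewards, computed back-to-front over the episode."""
    discounted = np.zeros(len(rewards))
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + discount_rate * running
        discounted[t] = running
    return discounted
```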
The tf.keras API was used to build the models. Each architecture consisted of a series of fully-connected layers. Dropout was added for the final design, which performed the best during testing.
The final design, model_v5, consisted of the following:
Dense(32) → Dropout → Dense(32) → Dropout → Dense(32) → Dropout → Dense(32) → Dropout → Dense(32) → Dense(2) → Softmax
After 5 iterations of the model architecture, a policy model was trained that converged and was able to survive in the environment. The program was manually stopped at 15,000 steps (about 5 minutes); it took 900 episodes of training to reach this point. Below is a video showing the agent's progress throughout the training loop.
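In tf.keras, that final architecture corresponds to roughly the following; the hidden activations and dropout rate are my assumptions, as the write-up does not state them:

```python
import tensorflow as tf
from tensorflow.keras import layers

# model_v5: five Dense(32) blocks with Dropout after the first four,
# ending in a 2-way softmax over the actions [go left, go right].
model = tf.keras.Sequential([
    layers.Dense(32, activation="relu", input_shape=(4,)),
    layers.Dropout(0.2),  # rate assumed
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(32, activation="relu"),
    layers.Dropout(0.2),
    layers.Dense(32, activation="relu"),
    layers.Dense(2, activation="softmax"),
])
```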
Three resources helped me as I learned about Reinforcement Learning:
Machine Learning was used to solve this problem. Using the tf.keras API, a Convolutional
Neural Network was trained on the MNIST dataset. The following model architecture was
used.
Thanks to Google, I was able to train on a state-of-the-art TPU (Tensor Processing Unit)
through their free cloud service: Google Colab. On the MNIST dataset, I reached a
validation accuracy of approximately 99.2%.
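For illustration, a small tf.keras CNN in this spirit can reach comparable accuracy on MNIST; the specific layers and sizes below are my assumptions, not the exact trained architecture:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Illustrative MNIST CNN (architecture assumed, not the exact model used).
model = tf.keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(28, 28, 1)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```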
It is important to note that the
digits in the MNIST dataset are written in pen, while we draw ours on an HTML
canvas; this difference could slightly impact model performance.