Kohei Suzuki

Projects

Aug 2020 - Current

DeepColors app for iOS Website

About this project

A mobile app that uses deep learning to help users colorize sketches they have drawn as well as grayscale photos.

Demos

- Colorize sketches

- Colorize grayscale photos

Application outline

Building the app took about 3 months, starting from learning how to display "Hello world" in a SwiftUI View.

Client Side

Machine Learning

Server Side

Monetisation

Jul 2020 - Aug 2020

Speech-To-Text app with Flask [github]

About this project

A Speech-To-Text app built with Flask. We upload a video or audio file and get back a transcript of the speech it contains.

How it works

Once we upload a video file, the app extracts the audio track along with file information such as the sampling rate, using ffmpeg-python, a wrapper around ffmpeg. Based on that information, it converts the audio to a 1-D NumPy array, which is fed into the DeepSpeech model, a model trained with machine learning techniques based on Baidu's Deep Speech research paper. The output of the DeepSpeech model is then fed into a language model to improve prediction accuracy.

Application outline


After a video or audio file is uploaded, the app collects information about the file, such as the sampling rate, and, if the file is a video, extracts the audio using ffmpeg-python. The DeepSpeech model expects a sampling rate of 16,000 Hz, so the audio is resampled when its rate differs. The audio is then converted to a 1-D NumPy array, which is fed into the DeepSpeech model.
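
The extraction and resampling step can be sketched as follows. This is a minimal illustration, not the project's code: it builds arguments for the plain ffmpeg command line rather than calling ffmpeg-python, and it decodes the samples with the standard library instead of NumPy. The function names and the example file name are hypothetical.

```python
import array
import sys

TARGET_RATE = 16000  # sampling rate the DeepSpeech model expects (from the text)


def build_ffmpeg_args(input_path, rate=TARGET_RATE):
    """Arguments asking ffmpeg for mono 16-bit PCM at `rate` Hz on stdout."""
    return [
        "ffmpeg", "-i", input_path,
        "-f", "s16le",            # raw signed 16-bit little-endian samples
        "-acodec", "pcm_s16le",
        "-ac", "1",               # downmix to a single channel
        "-ar", str(rate),         # resample if the source rate differs
        "-",                      # write to stdout instead of a file
    ]


def pcm_bytes_to_samples(raw):
    """Decode raw s16le bytes into a flat (1-D) sequence of int16 samples."""
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # the stream is little-endian
    return samples
```

In practice the argument list could be passed to `subprocess.run(..., capture_output=True)`, and `numpy.frombuffer(raw, dtype=numpy.int16)` would give the same samples as a NumPy array.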


The output from the DeepSpeech model is then fed into a language model to improve the prediction accuracy.
Once it gets the output from the language model, it creates a JSON file containing a list of words, each with its start time and duration. From the JSON file, it also creates a text file containing the words joined by spaces into a single transcript. Both files can be downloaded together as a zip file.
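
The packaging step described above could look roughly like this. It is a sketch under assumptions: the key names (`word`, `start_time`, `duration`) and the file names inside the archive are hypothetical, since the page does not list them.

```python
import io
import json
import zipfile


def build_transcript_zip(words):
    """Pack a word-timing JSON file and a plain-text transcript into a zip."""
    payload = {"words": words}
    # The text file is the words joined by single spaces.
    text = " ".join(item["word"] for item in words)
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("transcript.json", json.dumps(payload, indent=2))
        archive.writestr("transcript.txt", text)
    return buffer.getvalue()


words = [
    {"word": "hello", "start_time": 0.0, "duration": 0.4},
    {"word": "world", "start_time": 0.5, "duration": 0.3},
]
zip_bytes = build_transcript_zip(words)
```

Building the archive in memory means the bytes can be returned directly from a Flask view without writing temporary files.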

Further work

Improve the UI.
Add a feature that detects specific motions of the user and marks the corresponding frames, so that the user can easily find where they want to cut.
Deploy it as C++ software, since I want to build software with C++.

Skills used

Python:
Others:

Jul 2020 - Jul 2020

Twitter Real Time Financial Sentiment Analysis [github]

About this project

A Flask application where we enter the hashtags and keywords of the tweets we want to stream, and FinBERT, a pre-trained NLP model for analyzing the sentiment of financial text, performs sentiment analysis on those tweets in real time. The collected tweets that contain the hashtags or keywords are shown in a Pandas DataFrame together with the sentiment scores given by FinBERT.

FINBERT

FinBERT is built by further training the BERT language model in the finance domain on a large financial corpus and then fine-tuning it for financial sentiment classification. The Financial PhraseBank from Malo et al. (2014), which can be downloaded from here, is used for fine-tuning. For details, please see FinBERT: Financial Sentiment Analysis with Pre-trained Language Models.

Application outline


We start by entering the hashtags and keywords we want to stream, such as $TSLA, $GOOGL, #CAD, #USDJPY, and so on.
Because the sentiment scores come from the FinBERT NLP model, the hashtags and keywords should be related to finance.
The image below was taken after some keywords had been entered. The app is now ready to start streaming with Tweepy, a Python library for accessing the Twitter API.

Responsive image

Once streaming has started, each collected tweet that contains at least one of the keywords we defined is preprocessed with NLTK and then given a sentiment score by FinBERT. Once the sentiment score is calculated, we store each tweet's information in a CSV file and display it in a Pandas DataFrame, as in the example below.

Responsive image
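
The filter-score-store loop described above can be sketched with the standard library alone. The column names are hypothetical, and `dummy_score` is a stand-in for the FinBERT call, which is not reproduced here.

```python
import csv
import io

FIELDS = ["text", "matched", "sentiment"]  # hypothetical column names


def matched_keywords(text, keywords):
    """Return the tracked keywords that occur in the tweet (case-insensitive)."""
    lowered = text.lower()
    return [k for k in keywords if k.lower() in lowered]


def append_scored_tweet(handle, text, keywords, score_fn):
    """Score a matching tweet and append one CSV row; skip tweets with no match."""
    hits = matched_keywords(text, keywords)
    if not hits:
        return False
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writerow(
        {"text": text, "matched": " ".join(hits), "sentiment": score_fn(text)}
    )
    return True


def dummy_score(text):
    # Placeholder for the FinBERT model; always returns a neutral score.
    return 0.0


out = io.StringIO()
append_scored_tweet(out, "$TSLA beats estimates", ["$TSLA", "#USDJPY"], dummy_score)
append_scored_tweet(out, "nice weather today", ["$TSLA", "#USDJPY"], dummy_score)
```

With a real file handle in place of `io.StringIO`, the resulting CSV can be loaded with `pandas.read_csv` for display.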

Instead of a local CSV file, the data could also be stored in a service such as DynamoDB on AWS, accessed through the boto3 library.

Further work

Some of the functionality in this application can also be reused to create new features for the Automated Forex Trading Strategy that I have been working on.
To apply the application to a different domain, we can replace the deep learning model that scores the tweets with another NLP model trained on a different dataset.

Skills used

Python:
Others:

Jun 2020 - Jul 2020

Sudoku Solver on Quantum Computers [github]

About the Sudoku solver

This is a Flask app that detects Sudoku puzzles shown to the webcam, using OpenCV and a CNN model that recognizes each digit.
Once a puzzle is detected, we formulate a Binary Quadratic Model (BQM) and an objective function to minimize, and then find the solution using D-Wave's quantum computers.

The purpose of this project:

Learn about annealing-based quantum computers, which are good at solving particular kinds of problems such as optimization problems.
Get hands-on experience with OpenCV.
Learn testing and how to use CI tools such as CircleCI.
Learn how to use Docker.

Preprocessing steps

This section describes the workflow from capturing a Sudoku puzzle to finding a solution.
I will go through the images from left to right and top to bottom so that you can easily picture what is going on inside.

1. Convert the frame to grayscale.
2. Apply an adaptive threshold to the array.
3. Blur the image using a median filter.
4. Detect the puzzle.
5. Create a mask.
6. Capture the grid.
7. Detect the vertical lines.
8. Detect the horizontal lines.
9. Calculate the points where the vertical and horizontal lines cross.
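
The last step, finding where the vertical and horizontal lines cross, can be sketched in plain Python. This assumes each detected line is available as a pair of endpoints, which is an assumption about the representation rather than something the page states; the function names are hypothetical.

```python
def intersect(line_a, line_b):
    """Intersection of two infinite lines, each given as ((x1, y1), (x2, y2)).

    Returns None when the lines are parallel.
    """
    (x1, y1), (x2, y2) = line_a
    (x3, y3), (x4, y4) = line_b
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if den == 0:
        return None
    cross_a = x1 * y2 - y1 * x2
    cross_b = x3 * y4 - y3 * x4
    px = (cross_a * (x3 - x4) - (x1 - x2) * cross_b) / den
    py = (cross_a * (y3 - y4) - (y1 - y2) * cross_b) / den
    return (px, py)


def grid_points(vertical, horizontal):
    """Rows of crossing points, one row per horizontal line."""
    return [[intersect(v, h) for v in vertical] for h in horizontal]


# A 9x9 Sudoku grid has 10 vertical and 10 horizontal lines.
vertical = [((x, 0), (x, 90)) for x in range(0, 91, 10)]
horizontal = [((0, y), (90, y)) for y in range(0, 91, 10)]
points = grid_points(vertical, horizontal)
```

Neighbouring points in the resulting 10 x 10 array give the four corners of each cell, which is what is needed to crop the individual digits.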

Samples of the cropped digits that are fed into the CNN model.

Then we create a 2D NumPy array that represents the Sudoku puzzle the user showed. Each cell in the grid is filled in from the prediction of a CNN model trained on Chars74K.

The last thing the user has to do is fix any digits that the CNN model misclassified by entering the correct digit in the corresponding text box.

Formulate our problem for D-Wave's quantum computers.

We need to formulate the problem we want to solve as a Binary Quadratic Model (BQM).
To solve a BQM, we define an objective function in either Quadratic Unconstrained Binary Optimization (QUBO) or Ising form. By finding the values that minimize the objective function, we solve the BQM.
A BQM equation has two parts:
- Objective: what we are trying to minimize
- Constraints: rules we need to satisfy

BQM Development Process

Convert the objective and constraints to mathematical statements with binary (0/1) variables if we picked QUBO as the objective function, or -1/+1 variables if we picked Ising.
Make the objective and constraints "QUBO appropriate":
- The objective is a function to be minimized
- Constraints are satisfied at their minimum values

Binary Quadratic Model
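
Using the symbols that the following descriptions refer to, the general form of a BQM can be written as:

$$\sum_{i} a_{i} v_{i} + \sum_{i} \sum_{j > i} b_{i, j} v_{i} v_{j} + c$$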

The coefficients $a_{i}$ and $b_{i, j}$ are constant numbers we choose to define our problem, as is the constant term $c$.
The binary variables $v_{i}$ and $v_{j}$ are the values that we are looking for to solve our problem.
The best solution for these variables is the value for each $v_{i}$ that produces the smallest value for the overall expression.
Searching for the variables that minimize an expression is called an “argmin” in mathematics.

Linear Terms:
The first summation, $\sum a_{i}v_{i}$, contains linear terms, with each having just one binary variable.
Quadratic Terms:
The second summation, $\sum \sum b_{i, j}v_{i}v_{j}$, contains quadratic terms, with each term in the summation containing a product of two variables.
Constants:
In the general BQM form we may or may not include constants. Since we are looking for an argmin, any constant terms will not affect our final answer. However, it may be useful when interpreting the output from D-Wave solvers and samplers.
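
To make the linear, quadratic, and constant terms concrete, here is a small sketch that evaluates a BQM and finds its argmin by brute force. It does not use D-Wave's tools; the function names are hypothetical, and the example constraint ("exactly one of three variables is 1", as would be needed for one digit per Sudoku cell) is chosen for illustration.

```python
from itertools import product


def bqm_energy(a, b, c, v):
    """Evaluate sum_i a[i]*v[i] + sum_(i,j) b[(i,j)]*v[i]*v[j] + c."""
    linear = sum(a[i] * v[i] for i in a)
    quadratic = sum(coef * v[i] * v[j] for (i, j), coef in b.items())
    return linear + quadratic + c


def brute_force_argmin(a, b, c):
    """Try every binary assignment; only feasible for a handful of variables."""
    names = sorted(a)
    best_energy, best = None, []
    for bits in product((0, 1), repeat=len(names)):
        v = dict(zip(names, bits))
        e = bqm_energy(a, b, c, v)
        if best_energy is None or e < best_energy:
            best_energy, best = e, [v]
        elif e == best_energy:
            best.append(v)
    return best_energy, best


# "Exactly one of x0, x1, x2 is 1", written as the penalty (x0 + x1 + x2 - 1)^2.
# Expanding with x*x == x for binary x gives linear -1, quadratic +2, constant +1.
a = {"x0": -1, "x1": -1, "x2": -1}
b = {("x0", "x1"): 2, ("x0", "x2"): 2, ("x1", "x2"): 2}
energy, solutions = brute_force_argmin(a, b, c=1)
```

The minimum energy is 0 and is reached only by the three assignments with a single 1, which is what "constraints are satisfied at their minimum values" means. Dropping the constant changes the energy but not which assignments win.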

For more information about BQM, QUBO, and Ising, please visit here.

Note

Although D-Wave's quantum computers compute remarkably fast, many people share access to them, so we have to wait in a queue before our job runs. The total process therefore takes some time, but it usually finishes within a minute.
Sometimes the quantum computers cannot find the solution to a given Sudoku puzzle, especially a difficult one, because they run the calculation several times and return the best solution they found. In those cases, none of the runs converged to an optimum of the objective function.

Jan 2020 - Present

Forex Trading System with Deep Reinforcement Learning

Overview

This is an automated forex trading strategy that uses reinforcement learning, a type of machine learning. The goal is to optimize a forex trading strategy and make a profit with it on the real financial market while I am sleeping.

This project can be divided into two sections: the MVP and version 2.0.

In MVP section:
In version 2.0 section:

MVP

First, I will list the key parts of the MVP, which I finished in about a month, and then explain each of them.

SureFireStrategy
Gramian Angular Field
Data
Result
Deployment

1. SureFireStrategy

I loosely followed the paper Deep Reinforcement Learning for Foreign Exchange Trading.

The paper tries to optimize the SureFireStrategy, a variant of the Martingale, using a ConvNet as the reinforcement learning agent. The ConvNet looks for patterns in heatmap images encoded from time series data with a Gramian Angular Field (GAF), which I describe later.

The Sure-Fire strategy

First, as illustrated in Fig. 2, we purchase one unit at any price and set a stop-gain price of +k and a stop-loss price of −2k. At the same time, we select a price with a difference of −k to the buy price and +k to the stop-loss price and set a backhand limit order for three units. Backhand refers to engaging in the opposite behavior. The backhand of buying is selling and the backhand of selling is buying. A limit order refers to the automatic acquisition of corresponding units.
As illustrated in Fig. 3, when a limit order is triggered, and three units are successfully sold backhand, we place an additional backhand limit order, where the buy price is +k to the sell price and −k to the stop-loss price. We set the stop-gain point as the difference of +k and the stop-loss point as the difference of −2k, after which an additional six units are bought.
As illustrated in Fig. 4, the limit order is triggered in the third transaction. The final price exceeded the stop-gain price of the first transaction, the stop-loss price of the second transaction, and the stop-gain price of the third transaction. In this instance, the transaction is complete. The calculation in the right block shows that the profit is +1k.
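
The +1k result can be checked by replaying the three transactions. This is a minimal sketch with hypothetical function names and an arbitrary starting price; all three positions are closed at the first buy price plus k, as in the quoted example.

```python
def position_pnl(side, units, entry, exit_price):
    """Profit of one position; side is +1 for a buy and -1 for a sell."""
    return side * units * (exit_price - entry)


def sure_fire_example(price, k):
    """Replay the three transactions, all closed at price + k."""
    exit_price = price + k
    positions = [
        (+1, 1, price),      # first transaction: buy 1 unit
        (-1, 3, price - k),  # second: backhand sell of 3 units, k below the buy
        (+1, 6, price),      # third: backhand buy of 6 units, k above the sell
    ]
    return [position_pnl(s, u, e, exit_price) for s, u, e in positions]


pnl = sure_fire_example(price=100.0, k=1.0)
total = sum(pnl)
```

The first buy gains k, the sell of three units loses 2k per unit (6k in total), and the buy of six units gains k per unit (6k in total), so the net result is +1k.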


2. Gramian Angular Field (GAF)

The left image shows price movement on a 5-minute timeframe with a window size of 12. The right image is the GAF encoding of that price movement. It is a sample of the images that were fed into the ConvNet and used as the states in reinforcement learning. Each image had 4 channels corresponding to the Open, High, Low, and Close prices of the timeframe.
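
A GAF encoding of one price series can be sketched in a few lines. This assumes the summation variant (Gramian Angular Summation Field), since the page does not say which variant was used, and the price values are made up for illustration.

```python
import math


def gramian_angular_field(series):
    """Encode a 1-D series as a Gramian Angular Summation Field (list of rows)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        scaled = [0.0 for _ in series]
    else:
        # Rescale into [-1, 1] so that arccos is defined.
        scaled = [2 * (x - lo) / (hi - lo) - 1 for x in series]
    # Clamp to guard against floating-point round-off before taking arccos.
    angles = [math.acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[math.cos(p + q) for q in angles] for p in angles]


# 12 hypothetical closing prices, matching the window size of 12.
closes = [1.1012, 1.1015, 1.1011, 1.1018, 1.1022, 1.1020,
          1.1025, 1.1023, 1.1028, 1.1031, 1.1027, 1.1034]
image = gramian_angular_field(closes)
```

A window of 12 prices gives a 12 x 12 image. Applying the same function to the Open, High, Low, and Close series and stacking the results would give the 4 channels mentioned above.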

3. Data

I obtained the forex data through a Python API provided by OANDA, the broker I use. I was able to gather data in any major timeframe I wanted.

4. Result

The image below is a plot of the training result. From this plot, I could say the following:

The model had clearly not been trained well
There was an exploration and exploitation problem
The SureFireStrategy might not be a good fit
The data quality might not be good enough

How I addressed these problems is described in the Version 2.0 section. Responsive image

5. Deployment

Although I did not yet have a model that could make a profit on the real market, I deployed the model on AWS EC2 and automated the whole process by defining the operations in a bash script. I also set up CloudWatch to turn the server off and on, so as not to waste money on weekends when the forex market is closed.

Version 2.0

I have been working on version 2.0. It differs from the MVP in the following areas:

Trading strategy
Definition of the state
Data
Result
Further work

1. Trading strategy

The SureFire Strategy only shows its strength when I can keep doubling my bet over and over again. Because of my bankroll size, I was not able to place orders like that. So instead of using the SureFireStrategy, I set a fixed number of pips at which to exit the market.

2. Definition of the state

In the MVP, I used encoded heatmap images as the state, but this may be why the model did not train well, meaning that the ConvNet could not find any patterns in the images. So I switched to defining the state with technical indicators as features.

3. Data

In the MVP, I used data provided by OANDA, but the data contained a considerable number of NaN values that I had to fill in. In addition, the paper I followed trained on data aggregated by timeframe rather than on bid-ask prices, so I could not faithfully reproduce the price movement that actually happens in the real market. So I started collecting bid-ask data in real time, and that data is used to train the models in version 2.0.

4. Result of backtesting

All the entry points for short from 2020/May/18 to 2020/May/23 (green: made profit, red: loss)

All the entry points for long from 2020/May/18 to 2020/May/23 (green: made profit, red: loss)

All the entry points for short from 2020/May/25 to 2020/May/30 (green: made profit, red: loss)

All the entry points for long from 2020/May/25 to 2020/May/30 (green: made profit, red: loss)

	2020/May/18 - 2020/May/23	2020/May/25 - 2020/May/30
Profit
Number of trades	117	105
Number of winning trades	86	79
Number of losing trades	31	26
Winning ratio	0.735	0.752
Profit Factor	1.515	1.820
Max Drawdown	-40 pips	-50 pips
Net Profit	217.7 pips	225.0 pips
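
The winning ratios in the table follow directly from the trade counts, as the sketch below shows. The profit factor is computed here with its usual definition (gross profit divided by gross loss), which is an assumption since the page does not state the definition used, and the list of per-trade pips is made up because the individual trades are not listed.

```python
def win_ratio(wins, losses):
    """Share of trades that closed with a profit."""
    return wins / (wins + losses)


def profit_factor(trade_pips):
    """Gross profit divided by gross loss over a list of per-trade results."""
    gross_profit = sum(p for p in trade_pips if p > 0)
    gross_loss = -sum(p for p in trade_pips if p < 0)
    return gross_profit / gross_loss if gross_loss else float("inf")


week1 = round(win_ratio(86, 31), 3)  # 2020/May/18 - 2020/May/23
week2 = round(win_ratio(79, 26), 3)  # 2020/May/25 - 2020/May/30

# Hypothetical per-trade results in pips, for illustration only.
example_factor = profit_factor([10, -5, 20, -5])
```

A profit factor above 1 means the winning trades earned more in total than the losing trades lost.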

Further work

Solve overfitting:
Use heatmap images as extra features.
Use a Fourier transform to approximate the price movement and calculate its derivatives, which can serve as features indicating the strength of the current trend.
Implement algorithmic trading strategies and use their outputs as one-hot encoded features.
Update the reward function, which is one of the crucial parts of reinforcement learning.
Hyperparameter tuning and feature selection.

Aug 2019 - Sep 2019

Ticket-Dodger [link]

This is the final team project of the Machine Learning Bootcamp at 7 Gate Academy: an application that predicts the likelihood of getting a parking ticket in the Vancouver area based on the user's geolocation and the time. When a user taps the location where they plan to park or are currently parked, the tap triggers a call to AWS Lambda, where our machine learning model runs to predict the likelihood.

Here is how Paul and I created this application within a month. Responsive image
We found a dataset in the Vancouver open data catalog. The original dataset contained information about the parking tickets issued, such as date and time, address including block, infraction, status, etc. However, the dataset did not have a target variable that we could use, which in our case is the likelihood or probability of getting a parking ticket. I explain how we solved this problem in the Obstacles section below, but the short answer is that we created one using the traffic counts on each street.
We estimated the probability for each street and thresholded it to create three categories, Low, Medium, and High, which were the likelihoods we predicted. We treated this as a classification problem because a category is more user-friendly than a raw probability.
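
The thresholding step can be sketched as below. The cut points are hypothetical, since the page does not give the thresholds that were actually used.

```python
def risk_category(probability, low_cut=0.33, high_cut=0.66):
    """Map an estimated probability to one of the three class labels.

    The cut points are placeholders, not the values used in the project.
    """
    if probability < low_cut:
        return "Low"
    if probability < high_cut:
        return "Medium"
    return "High"


labels = [risk_category(p) for p in (0.05, 0.40, 0.90)]
```

These labels then serve as the target variable for the classifiers described in the training section.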

EDA

While working on feature engineering, we found that time was definitely a factor. As you can see below, the chance of getting a ticket is high around 3 PM.

Training Machine Learning Models

As I mentioned, this was a classification problem, so we started by training a simple logistic regression because it was easy to implement.
Afterwards, we trained different kinds of models such as Random Forest, XGBoost, and neural networks. First, we made sure the models had enough capacity to learn something from our data by trying to make them overfit the training data.
Then we iteratively built more complicated models, for instance by changing the number of neurons in each layer, the number of layers, the optimizer, and so on in a Multi-Layer Perceptron (MLP).
Here is one of the results we got from MLP and XGBoost after a hyperparameter search using Hyperas and Optuna, which are Python frameworks for hyperparameter search.

Model Evaluation

Subjectively evaluating our models was difficult. The best that we could say was that we did a pretty good job of determining the low risk of getting a ticket. It is much more important for us to have accurate LOW risks. For example, if you park expecting a low risk and you end up getting a ticket, it will be a much worse user experience than if you went in expecting a ticket and got none!

We did chase down a parking enforcement officer and asked for his opinion, and he named some streets where parking tickets are common. Our predictions from XGBoost were pretty good. Given the model's performance and its inference time of 28.19 ms, we chose the XGBoost model.

Application Architecture

Backend

Server:
Custom built location to street matching engine
Model:
Deployment: DigitalOcean Droplet
- 1 vCPU
- 1 GB RAM
- 24 GB SSD

Client Side

Website:
Map:
Deployment:

How did we work as a team

Since we lived too far apart to work together in person, it was important that we had a good system for working together.
We started by working together on sourcing our data, evaluating what we had, and creating a merged base dataset.
To streamline our approach, we then split up our roles to focus on primary areas: I focused on building machine learning models, and Paul worked on development.
Afterwards, we did a knowledge transfer to fill each other in on anything we might have missed.

We loosely followed an agile way of working, changing things as we needed. We made sure we reviewed each other's work against the standards that we set for ourselves. To do so, we used Trello to manage our tasks. Here are some of the tags we had in our channel on Trello.

Product backlog
Current sprint
Doing
Review
Blocked
Done

We also had a daily meeting to catch up on what everyone had done.

Obstacles

Target variable creation
As I mentioned above, we did not have a target variable, that is, the probability or likelihood of getting a parking ticket. We created one using three datasets: the first contained information about the parking tickets issued, the second had the traffic counts on each street, including some private streets, and the third had almost all of the street names in the Vancouver area.
It was important for us to define what we meant by "Risk".
It was a fairly arbitrary term. We decided to use the number of tickets given divided by the amount of traffic on the street. In this way, we defined risk RELATIVE to the risk of other streets. The formula for estimating the probability for each block on each street was as follows:
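
Going by the description above (tickets divided by traffic, relative to other streets), the estimate can be sketched as follows. Scaling by the highest ratio is an assumption made here to express "relative" risk, and the block names and counts are made up for illustration.

```python
def relative_risk(ticket_counts, traffic_counts):
    """Tickets / traffic per block, scaled so the riskiest block equals 1.0."""
    raw = {
        block: ticket_counts.get(block, 0) / traffic
        for block, traffic in traffic_counts.items()
        if traffic > 0  # blocks without traffic data cannot be estimated
    }
    top = max(raw.values(), default=0)
    return {block: (value / top if top else 0.0) for block, value in raw.items()}


# Hypothetical counts for three blocks.
tickets = {"100 GEORGIA W": 30, "200 GEORGIA W": 5}
traffic = {"100 GEORGIA W": 1000, "200 GEORGIA W": 1000, "300 GEORGIA W": 500}
risk = relative_risk(tickets, traffic)
```

Dividing by traffic keeps busy streets from looking risky simply because more cars park there.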

To do the calculation, we needed to make sure that the street names in the parking ticket dataset and the traffic counts dataset had the same format, so that we could merge the two datasets with the street as the key.
Here is an example of a street name we needed to clean up: "WEST GEORGIA" and "GEORGIA W"
So we used a Python library, fuzzywuzzy, to clean up the street names.
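
The idea behind the clean-up can be illustrated with the standard library. This sketch uses `difflib` in place of fuzzywuzzy, and the abbreviation table and function names are hypothetical. It expands direction abbreviations and sorts the tokens before comparing, so that word order does not matter.

```python
from difflib import SequenceMatcher

# Hypothetical abbreviation table for compass directions.
DIRECTIONS = {"W": "WEST", "E": "EAST", "N": "NORTH", "S": "SOUTH"}


def normalize_street(name):
    """Upper-case, expand direction abbreviations, and sort the tokens."""
    tokens = [DIRECTIONS.get(t, t) for t in name.upper().split()]
    return " ".join(sorted(tokens))


def similarity(a, b):
    """Similarity in [0, 1] between two street names after normalization."""
    return SequenceMatcher(None, normalize_street(a), normalize_street(b)).ratio()


same = similarity("WEST GEORGIA", "GEORGIA W")
different = similarity("WEST GEORGIA", "MAIN")
```

Names whose similarity is above a chosen cut-off can be mapped to one canonical spelling before merging the datasets.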

Education

Dec 2019 - Mar 2020

The University of Tokyo

Data Science Certificate [link] [See certificate]

I passed coding examinations to be admitted to this course. Of the 900 students who started the course, I am one of about 400 who successfully finished it.

Curriculum

1st:
Introduction to Data Science and Python
2nd:
Numpy and Pandas
3rd:
Visualization
4th:
Probability and Statistics
5th:
Supervised Learning
6th:
Unsupervised Learning
7th:
Model evaluation and Hyperparameter tuning
8th:
Final Project

Aug 2019 - Sep 2019

7 Gate Academy

Machine Learning Bootcamp [link]

Out of about 120 applicants, I was selected as one of the 7 people admitted to the course.

Curriculum
8 weeks, 3.5 hours * 4 days/week.

1st week:
2nd week:
3rd week:
4th week:
5th week:
6th week:
7th week:
8th week:

Ticket-Dodger

Jul 2017 - Sep 2019

Institute of Technology Development of Canada

Computer Science Diploma [link] [See transcript]

The course was a 2-year diploma in Open Source Programming, consisting of one year in class and a second year in a co-op program. I worked as a Machine Learning Developer at Singular Software Inc.

Apr 2018 - Jun 2018

Brain Station Vancouver

Data Science Bootcamp [link] [See certificate]

Curriculum

UNIT 1 Python Programming
UNIT 2 Working with Data
UNIT 3 Data Visualization
UNIT 4 Numerical Models
UNIT 5 Classification Models
UNIT 6 Model Validation
UNIT 7 Machine Learning
UNIT 8 Presenting Data

Apr 2012 - Apr 2013

Tokyo City University

Bachelor's Degree in Computer Science [link]

Matriculated

ACHIEVEMENTS

Certificate

Mar 2020 - No Expiration Date
- Reinforcement Learning Specialization
Organization: Coursera

Dec 2019 - No Expiration Date
- TensorFlow in Practice Specialization
Organization: Coursera

Dec 2019 - No Expiration Date
- Sequences, Time Series and Prediction
Organization: Coursera

Dec 2019 - No Expiration Date
- Natural Language Processing in TensorFlow
Organization: Coursera

Nov 2019 - No Expiration Date
- Convolutional Neural Networks in TensorFlow
Organization: Coursera

Nov 2019 - No Expiration Date
- Introduction to TensorFlow for Artificial Intelligence, Machine Learning, and Deep Learning
Organization: Coursera

Nov 2018 - No Expiration Date
- Databases and SQL for Data Science
Organization: Coursera

Oct 2018 - No Expiration Date
- Deep Learning Specialization
Organization: Coursera

Oct 2018 - No Expiration Date
- Sequence Models
Organization: Coursera

Sep 2018 - No Expiration Date
- Convolutional Neural Networks
Organization: Coursera

Sep 2018 - No Expiration Date
- Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization
Organization: Coursera

Sep 2018 - No Expiration Date
- Neural Networks and Deep Learning
Organization: Coursera

Sep 2018 - No Expiration Date
- Structuring Machine Learning Projects
Organization: Coursera

Sep 2018 - No Expiration Date
- Machine Learning
Organization: Coursera

Kohei Suzuki

email address: kohei.suzuki808@gmail.com | Download CV

Main skills

Experience

Machine Learning Developer Internship

About the company

Responsibilities

Skills used
