Joanna Khek AI Apprentice Singapore

About Me

Hi, I am Joanna Khek Cuina from Singapore. I'm currently an AI Apprentice at AI Singapore, where I am immersed in the dynamic field of artificial intelligence. I hold a Master's degree in Statistics from the National University of Singapore and a Bachelor's degree in Mathematical Sciences from Nanyang Technological University.

My passion lies in leveraging data to solve real-world challenges, and I find great satisfaction in applying statistical insights to drive meaningful outcomes. I follow AI advancements closely and keep myself updated with new technologies and methodologies in the field. I believe in the power of continuous learning and seek opportunities to learn from others in the AI and data science community.

Feel free to connect with me on LinkedIn!

Skills

Programming

Visualization

Web Application

Tools

LLM Frameworks

Cloud

Projects

Sound Nest

  • Django, FastAPI, React, NextJS, Redux, LangGraph

Sound Nest is an enhanced version of my earlier project, Rhythmix, featuring several key improvements:

  • Next.js for server-side rendering, significantly improving application speed and performance.
  • Djoser for robust and efficient user authentication.
  • Redux for scalable and centralized state management.
  • TypeScript for improved type safety, code maintainability, and developer experience.

Check it out

Rhythmix

  • Django, React, LangGraph

Rhythmix was initially developed as a song recommender using Streamlit. To explore modern frontend development, I rebuilt the application using React for a dynamic user interface and Django for a robust backend. This project showcases my learning journey with React while enhancing the song recommendation system with natural language processing and vector search capabilities.

Check it out

Government Support Bot

  • LangChain, LangGraph, Streamlit, FAISS

This project leverages RAG with an agentic workflow to assist users in finding support schemes that truly align with their needs. Unlike the current implementation on the main website, where users are limited to filtering schemes and often receive irrelevant results, this system ensures more personalized and precise recommendations. Key features include Multi-Query Retrieval, Reciprocal Rank Fusion, Human-in-the-loop and an iterative workflow via agents.
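Reciprocal Rank Fusion, one of the features named above, can be illustrated with a minimal sketch. This is not the project's actual code: the function name, the scheme identifiers, and the smoothing constant `k=60` (a commonly used default) are assumptions for illustration. Each ranked list stands for the results retrieved for one rephrasing of the user's query.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken alphabetically by id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for three rephrasings of one user question.
rankings = [
    ["scheme_a", "scheme_b", "scheme_c"],
    ["scheme_b", "scheme_a", "scheme_d"],
    ["scheme_b", "scheme_c", "scheme_a"],
]
fused = reciprocal_rank_fusion(rankings)
print(fused)
```

A scheme that ranks well across several query variants rises to the top even if no single query ranks it first everywhere.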

Check it out

NoteWorthy

  • LangChain, LLM

This product was created during the two-day mini-project in the deep skilling phase at AIAP. NoteWorthy helps apprentices get insights into fellow apprentices' .ipynb notebooks. By simplifying the process of learning from peer notebooks, it is a handy companion for any apprentice eager to enhance their skills and knowledge.

Check it out

HDB BTO RAG Chatbot

  • LangChain, Streamlit, Chroma

The Housing & Development Board (HDB) is the public housing authority in Singapore. As someone who is awaiting the completion of my BTO, I found myself searching for relevant information on the HDB website. This prompted me to create this chatbot, which is designed to address questions related to Build-to-Order (BTO) flats.

Check it out

Retrieval Augmented Generation

  • LangChain, Chainlit

This project is a simple implementation of Retrieval Augmented Generation (RAG) using the LangChain and Chainlit frameworks. Data Helper is a bot that lets users ask questions about the content of a webpage. Chainlit provides the user interface through which users enter their queries.

Check it out

Tabi Studios Web Application

  • Database, Streamlit

My younger brother has recently embarked on his entrepreneurial journey. As his elder sister, I decided to put my data skills to good use. I developed a Streamlit web application to help him streamline his business processing workflow.

Check it out

Sentiment Prediction

  • PyTorch, FastAPI

The objective of this project is to develop a robust sentiment analysis system that classifies customer reviews from the popular travel platform TripAdvisor as positive or negative. The project is implemented in PyTorch and uses the DistilBERT model, which is known for its efficiency and performance in natural language processing tasks. The model is deployed using FastAPI and containerised using Docker.

Check it out

Brain Tumor Detection

  • PyTorch

The goal of this project is to develop a deep learning model for the detection of brain tumors using MRI (Magnetic Resonance Imaging) images. The project uses various deep learning techniques implemented in PyTorch.

Check it out

End-to-end Machine Learning Pipeline

  • Machine Learning

Over the years, I've explored many different machine learning tools and technologies. To solidify my learning, I've put together an end-to-end machine learning pipeline. I used MLflow and hyperopt for hyperparameter tuning. Following software engineering best practices, I used tox for virtual environment management and wrote unit tests with pytest. Furthermore, I ensured reproducibility through Docker and facilitated deployment using FastAPI.

Check it out

Food Explorer App

  • Streamlit Web Application

Food Explorer is a web application built using Streamlit. The app aims to provide you with the most delicious recommendations based on reviews, wishlisted counts, and pricing. Users can also view the summarised reviews in the form of a word cloud, as well as images submitted by the reviewers. The app uses data scraped from Burpple.

Check it out

Singapore Primary School Explorer

  • Streamlit Web Application

Choosing a primary school for your child can be a daunting task. What phase am I eligible for? What are the chances of getting into this particular school? Which schools offer this particular CCA? In an attempt to answer all those questions, I have created a web application to assist parents in selecting a primary school for their children.

Check it out

Pokemon Statistical Analysis

  • Statistical Analysis

In this mini analysis, I wanted to compare the power statistics of different Pokemon types and see if there is a significant difference between them. After checking the relevant assumptions, I performed a non-parametric Kruskal-Wallis test, followed by Dunn's post-hoc test to compare pairwise group differences.
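As a rough illustration of the Kruskal-Wallis test mentioned above, the sketch below computes the H statistic in plain Python, with average ranks for ties and the standard tie correction. It is not the analysis code itself: the function names and the sample numbers are made up for illustration, and the sketch stops at the statistic without computing a p-value. In practice a library routine such as `scipy.stats.kruskal` would normally be used.

```python
from itertools import chain


def average_ranks(values):
    """Rank values from 1..N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return the tie-corrected Kruskal-Wallis H statistic for the groups."""
    pooled = list(chain.from_iterable(groups))
    n_total = len(pooled)
    ranks = average_ranks(pooled)

    # Sum of (rank sum squared / group size) over all groups.
    total = 0.0
    start = 0
    for group in groups:
        n = len(group)
        total += sum(ranks[start:start + n]) ** 2 / n
        start += n
    h = 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(c ** 3 - c for c in counts.values())
    return h / (1 - tie_term / (n_total ** 3 - n_total))


# Made-up stat totals for three hypothetical groups of Pokemon.
fire, water, grass = [52, 60, 65], [70, 75, 80], [85, 90, 95]
h_stat = kruskal_wallis_h([fire, water, grass])
print(round(h_stat, 3))
```

Under the null hypothesis, H is compared against a chi-squared distribution with (number of groups - 1) degrees of freedom; a large H suggests that at least one group differs.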

Check it out

Hospital Coverage in Singapore

  • Spatial Analysis Project

One of the key amenities in a neighbourhood is a hospital. For a household with elderly members or family members with chronic conditions, living near a hospital means less travelling time, and in a critical situation it can be a matter of life or death. This project analyses the hospital coverage in Singapore so that flat buyers with hospital needs can make a more informed decision.

Check it out

Detecting structural defects using convolutional autoencoder

  • ST5188 Statistical Research Project

As part of my MSc in Statistics programme at NUS, we had to conduct basic research activities on a topic of interest. While doing my research, I found that most papers on applying convolutional autoencoders to anomaly detection are quite recent. As I am currently working at the Building and Construction Authority (BCA) of Singapore, I wanted to study the effectiveness of using a convolutional autoencoder to detect cracks.

Check it out

Disease Prediction Web Application

  • Flask Web Application

DiagnoseIt is a web application that helps users predict possible diseases from their symptoms. More often than not, we spend time googling what is wrong with us when we experience some symptoms. I wanted to create a consolidated site where users can select their symptoms from a list and immediately see the possible diseases they might have. Once the possible diseases are identified, users are provided with a description of each disease as well as precautions to prevent further injury to the body.

Check it out

HDB Resale Prices Web Application

  • Streamlit Web Application

The goal of this web application is to assist users in finding a suitable HDB resale flat in Singapore. The data is obtained from data.gov.sg and OneMap via their APIs. Users can find out approximately how much they will need to buy a flat in each town, explore the relationship between resale prices and remaining lease years across towns and flat types, browse potential units categorised into low, medium and high price bands, and more!

Check it out

Free Music @ Esplanade News

  • Telegram Bot Scraper

My partner and I are both musicians and we love attending music performances together. The Esplanade Singapore frequently organises free music performances, and we thought it would be great to get notified of new upcoming performances as soon as possible, since some performances require registration and tickets might be snapped up fast!

Check it out

YouTube Data Analysis

  • YouTube API

In this project, I analyse one of my favourite YouTube channels, BeatEmUps. I have been following this channel for a really long time and I have to say that Wood Hawker (the content creator of this channel) is extremely creative. His videos never fail to put a smile on my face! By the end of the analysis, I find out which of his videos perform best and worst.

Check it out

Neural Image Caption Generation

  • DSA5204: Deep Learning and Applications

As part of my graduate module DSA5204: Deep Learning and Applications, we were required to reproduce and extend one recent research paper on deep learning published in a reputable machine learning publication. My group decided to reproduce the "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention" paper. Image captioning is the task of taking an image and producing a sentence that describes it. The paper uses concepts such as the encoder-decoder architecture, the attention mechanism and Long Short-Term Memory (LSTM).
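To give a feel for the attention mechanism mentioned above, here is a minimal sketch of the soft attention step in plain Python. It is not the project's code: the function names and toy numbers are assumptions. The alignment scores are passed in directly, whereas in the paper they are produced by a small network from the image region features and the previous decoder state.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def soft_attention(region_features, scores):
    """Return attention weights and the weighted-average context vector.

    region_features: one feature vector per image region.
    scores: one alignment score per region (assumed given here).
    """
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [
        sum(w * feature[d] for w, feature in zip(weights, region_features))
        for d in range(dim)
    ]
    return weights, context


# Toy example: three image regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = soft_attention(regions, scores=[0.0, 0.0, 0.0])
print(weights, context)
```

With equal scores every region gets the same weight, so the context vector is the plain average of the region features. Raising one region's score shifts the context toward that region, which is how the decoder "looks at" different parts of the image while generating each word.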

Check it out

Foodie Khek Telegram Bot

  • Telegram Bot

What shall we eat today? What good food can I find in this area? Sound familiar? To solve this problem, I decided to spend my winter break building a simple Telegram bot that generates food choices for my preferred location. The list of food choices was scraped from DanielFoodDiary and EatBook using Python.

Check it out

Animal Crossing Villager Insights

  • RShiny Web Application

I have been enjoying Animal Crossing: New Horizons and stumbled upon a villager popularity list. I wondered about the factors contributing to the popularity of the different villagers and decided to build a Shiny dashboard to explore them. The project involves web scraping from different sources.

Check it out

Diabetes Prediction

  • Data Science Competition

The aim of this competition is to build a model to determine whether a patient admitted to an ICU has been diagnosed with a particular type of diabetes, Diabetes Mellitus. The leaderboard was evaluated on the area under the Receiver Operating Characteristic (ROC) curve between the predicted and the observed target. I ensembled three models built with CatBoost and LightGBM. My submission was ranked 48/808 on the Private Leaderboard (Top 6%).
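The evaluation metric and the idea of ensembling can be sketched in plain Python. This is not the competition code: the function names and toy scores are made up, and simple averaging is an assumed blending scheme, since the write-up does not say how the three models were combined. The AUC is computed here from its pairwise definition, the probability that a randomly chosen positive case is scored above a randomly chosen negative one.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison (ties count half)."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def average_ensemble(*model_scores):
    """Blend models by averaging their predicted probabilities (assumed scheme)."""
    return [sum(scores) / len(scores) for scores in zip(*model_scores)]


# Toy labels and predictions from two hypothetical models.
y_true = [0, 0, 1, 1]
model_a = [0.1, 0.4, 0.35, 0.8]
model_b = [0.2, 0.3, 0.6, 0.7]
blend = average_ensemble(model_a, model_b)

print(roc_auc(y_true, model_a))
print(roc_auc(y_true, blend))
```

In this toy case model A misorders one positive/negative pair, and averaging with model B repairs it. Blending does not always help, so it is usually checked on a validation set.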

Check it out

NUSWhispers Explorer

  • Flask Web Application

As an avid NUSWhispers reader, I have always wanted to know the main ideas behind each topic category, the average number of comments, shares and reactions in each category, and so on. Using Python for web scraping and Flask for the web application, I can easily have my questions answered! The website will be updated every month!

Check it out

Spotify Explorer

  • Dash Web Application

As a long-time Spotify user, I wanted to explore my music taste. This web application shows my all-time and current top artists, tracks and track details. I will be including more features in the future.

Check it out

University Comparison Dashboard

  • RShiny Web Application

A submission for the RShiny Competition 2020, this web application allows you to find out what courses you are eligible for, the historical employment rate, salary and more!

Check it out

Machine Learning for Banking

  • Machine Learning

This hackathon was organised by Analytics Vidhya and it ran from 29 May to 31 May. The goal of this project is to predict the loan rate category using various factors such as employment, loan amount and debt.

Check it out

Marketing Analytics

  • Data Science Competition

In this marketing analytics competition, the aim is to build a model that predicts whether a user opens the emails sent by Shopee. The dataset contains user-specific information, the nature of each email, the user's engagement on the platform and the user's reaction to the email. My submission was ranked 54/354 on the Private Leaderboard (Top 16%).

Check it out

HR Analytics Hackathon

  • Predictive Modelling

The hackathon was organised by Analytics Vidhya and it ran from 8 May to 10 May. The aim is to predict the probability that an enrollee will look for a new job, using factors such as gender, education level, experience, company size and company type.

Check it out

Product Detection

  • Image Classification

In this product detection competition, the task is to build a multi-class image classification model. There are ~100k images across 42 different categories, including essential medical tools like masks, protective suits and thermometers, home & living products like air-conditioners, and fashion products like T-shirts and rings. I adopted the fast.ai approach and my submission was ranked 130/646 on the Private Leaderboard (Top 21%).

Check it out

The Straits Times & The Business Times Web Scraper

  • Web Scraper using Selenium in Python

This web scraper extracts daily news articles and automatically downloads them in PDF format.

Check it out

E-Commerce Analytics Hackathon

  • Data Science Competition

The hackathon was organised by Analytics Vidhya and it ran from 10 April to 12 April. The dataset contained the list of products viewed by a user in a given session, along with each product's category, sub-category, sub-sub-category and product id. The aim of this hackathon is to predict the gender of the e-commerce users based on their product viewing records.

Check it out