This year’s Cassava Leaf detection challenge was a very good one: more than 3,500 teams took part, compared with only 86 last year. I joined the challenge a bit late and finished 1444th out of 3900 teams (top 38%). I missed a medal this time around, but I am pretty happy to have survived the shake-up, which moved me 425 places up the leaderboard. I want to note down everything I did in this competition and everything I learned this time around. It will be useful…

OCR of the images is also included.

Import Required Libraries

import pandas as pd
import numpy as np
import cv2
import seaborn as sns
import sklearn
import matplotlib.pyplot as plt
from textwrap import wrap
import pytesseract
import re
import string
from wordcloud import WordCloud, STOPWORDS
from tqdm.notebook import tqdm
from joblib import dump, load
import nltk
from nltk.corpus import stopwords
nltk.download('stopwords')
[nltk_data] Downloading package stopwords to /usr/share/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
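
The `re`, `string` and `stopwords` imports above suggest that text (product titles or OCR output) gets cleaned before use. The exact cleaning steps are not shown here, so the following is only a minimal sketch under assumptions: the function name `clean_text` and the small `SAMPLE_STOPWORDS` set are hypothetical, the latter standing in for `stopwords.words('english')` so that the snippet runs without any download.

```python
import re
import string

# Hypothetical stand-in for nltk's English stopword list (assumption),
# so this sketch runs offline. In the notebook you could pass
# set(stopwords.words('english')) instead.
SAMPLE_STOPWORDS = {"the", "a", "an", "of", "and", "for", "with"}


def clean_text(text, stop_words=SAMPLE_STOPWORDS):
    """Lowercase the text, replace punctuation with spaces, drop stopwords."""
    text = text.lower()
    # Replace every punctuation character with a space.
    text = re.sub(f"[{re.escape(string.punctuation)}]", " ", text)
    # Split on whitespace and keep only non-stopword tokens.
    tokens = [tok for tok in text.split() if tok not in stop_words]
    return " ".join(tokens)


print(clean_text("The BEST Phone-Case for iPhone 11, with FREE gift!"))
```

Digits are kept on purpose in this sketch, since model numbers can matter when matching products; whether that suits this dataset is a judgment call.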

path = '../input/shopee-product-matching'
train_path = '../input/shopee-product-matching/train_images'
test_path = '../input/shopee-product-matching/test_images'

Let’s have a look at the data head

data = pd.read_csv(path+'/'+'train.csv')

Basic Details about the data

print(f"The Shape of the train data : {data.shape}")
print(f"Duplicate Rows …
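
The basic checks above (shape, duplicate rows) can be tried on a toy frame. This is a sketch, not the notebook's actual code: the column names `posting_id`, `title` and `label_group` and all values are assumptions made up for illustration, and the extra checks (unique titles, missing values) are additions that the original cell may or may not contain.

```python
import pandas as pd

# Toy data; column names and values are illustrative assumptions,
# not taken from the real train.csv.
toy = pd.DataFrame({
    "posting_id": ["a1", "a2", "a3", "a3"],
    "title": ["red shoe", "blue bag", "red shoe", "red shoe"],
    "label_group": [10, 20, 10, 10],
})

n_rows, n_cols = toy.shape
# duplicated() marks a row as True if an identical row appeared earlier.
n_duplicate_rows = int(toy.duplicated().sum())
n_unique_titles = toy["title"].nunique()
n_missing = int(toy.isna().sum().sum())

print(f"The shape of the toy data : {toy.shape}")
print(f"Duplicate rows : {n_duplicate_rows}")
print(f"Unique titles : {n_unique_titles}")
print(f"Missing values : {n_missing}")
```

Swapping `toy` for the `data` frame loaded earlier gives the same summary for the real training set.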

Mohneesh S

ML Enthusiast || Problem Solver || Unemployed For Now
