Data collection is the first step in developing any machine learning model. The source and type of data play an important role in determining whether the upcoming steps will be easy or complex. If we acquire data with fewer missing values, with only the most important features that determine the output, and with less ambiguity, it becomes easy to process in the subsequent steps.

In data cleaning, we check for missing values, duplicate rows, features that do not have a high influence on the output, and other unwanted values. The relations between the remaining input and output features are studied through plots and graphs. We remove the unnecessary data and make our dataset clean.

Preprocessing the data means finding a suitable measure to bring the input features to a common standard, for example imputing missing values with the mean, median or mode, or scaling the features. Any categorical features should be encoded to numerical values.
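As a small illustration (the column names and values here are hypothetical), mean imputation and a simple categorical encoding might look like this:

```python
# Hypothetical toy dataset: impute missing ages with the mean,
# then encode a categorical feature as integers.
rows = [
    {"age": 25, "city": "Delhi"},
    {"age": None, "city": "Mumbai"},
    {"age": 35, "city": "Delhi"},
]

# Mean imputation for the numeric 'age' column
known = [r["age"] for r in rows if r["age"] is not None]
mean_age = sum(known) / len(known)
for r in rows:
    if r["age"] is None:
        r["age"] = mean_age

# Encode the categorical 'city' column to numerical values
codes = {c: i for i, c in enumerate(sorted({r["city"] for r in rows}))}
for r in rows:
    r["city"] = codes[r["city"]]

print(rows)
```

Libraries like pandas and scikit-learn provide ready-made versions of both steps; this sketch only shows the idea.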

After the data is cleaned and processed, we split the dataset into train data and test data. The training data is used to train the model, while the test data is used to check how well our model performs and to calculate its accuracy scores. In most cases, this ratio is taken as 70:30 or 80:20; it all depends on the use case.
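A minimal sketch of the idea (scikit-learn provides a ready-made `train_test_split`; the seed and ratio here are my own choices): an 80:20 split is just shuffling and slicing.

```python
import random

def train_test_split(data, test_ratio=0.2, seed=42):
    # Shuffle a copy so the split is random but reproducible
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

data = list(range(100))
train, test = train_test_split(data, test_ratio=0.2)
print(len(train), len(test))  # 80 20
```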

In this step, the test data is fed into the already trained model. The efficiency of our model is determined through the scores we obtain at the output. There are various metrics like accuracy, F1-score, R² score etc. which determine how accurate our model is.
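As a rough sketch of how two of these scores are computed for a binary problem (scikit-learn's `accuracy_score` and `f1_score` do this for you; the labels below are made up):

```python
def accuracy(y_true, y_pred):
    # fraction of predictions that match the true labels
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred):
    # harmonic mean of precision and recall for the positive class
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
print(accuracy(y_true, y_pred))
print(f1_score(y_true, y_pred))
```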

This is the final stage in a Machine Learning project. Once our model meets the standard set by us, it is time to deploy it. There are various platforms like Heroku, Netlify etc. which may be chosen as per the requirements of our project.

So that is it for this post. Hope you learnt something new out of it. Thank you for reading. 😊

The any() function takes an iterable as an argument: **any(iterable)**.

The iterable can be a list, tuple or dictionary.

The any() function returns 'True' if any element in the iterable is true. It returns 'False' if no element is true, or if the iterable passed to the function is empty.

This function is similar to the code block below

```
def any(iterable):
    for element in iterable:
        if element:
            return True
    return False
```

Below is an example of using any() to return 'True' when numbers greater than 3 are present. Here we have used a list comprehension to keep the code simple.

```
numbers = [2, 3, 4, 5, 6, 7]  # avoid naming a variable 'list', which shadows the built-in
print(any([num > 3 for num in numbers]))
```

The output is 'True' as 4,5,6 and 7 are greater than 3.

Next, we look into the all() function.

The all() function also takes an iterable as an argument: **all(iterable)**.

The all() function returns 'True' only if all the items in the iterable are true.

Even if one item is false, it returns 'False'. However, if the iterable is empty, it returns 'True'.
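The empty-iterable behaviour of the two functions is easy to verify:

```python
# any() on an empty iterable is False; all() on an empty iterable is True
print(any([]))  # False
print(all([]))  # True

# With all elements falsy, both return False
print(any([0, "", None]))  # False
print(all([0, "", None]))  # False
```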

all() function is similar to the code block below

```
def all(iterable):
    for element in iterable:
        if not element:
            return False
    return True
```

Below is an example of using all() to check whether all the numbers are greater than 3.

```
numbers = [1, 2, 3, 3]
print(all([num > 3 for num in numbers]))
```

The output is 'False' as no number in the provided list is greater than 3.

*In dictionaries, both all() and any() check the keys, not the values, when deciding whether to return True or False.*
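For example:

```python
# With a dictionary, the keys are tested, not the values
d1 = {0: "non-empty value"}
d2 = {1: ""}

print(any(d1))  # False: the only key is 0, which is falsy
print(any(d2))  # True: the key 1 is truthy, even though its value is empty
```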

Logistic regression is a linear algorithm with a non-linear transform at the output. The input values are combined linearly using weights to predict the output value. The output value being modelled is a binary value rather than a numeric value.

Suppose we have the results of a set of students, where the criteria for passing is that the student should score 50% or more. Else the student is classified as failed.

We could try to solve this classification problem using linear regression. But if our data contains some outliers, they will affect the orientation of the best-fit line.

NOTE: Outliers are points that lie apart from the usual group of points. These points can have a strong influence on the orientation of the best-fit line.

So we go on with Logistic regression for such use cases.

A sigmoid function is a function which limits the value of the output to the range between 0 and 1. The equation of the sigmoid function is: sigmoid(x) = 1 / (1 + e^(-x)).

The value x in the equation is computed from the inputs and the weights. The sigmoid function makes sure that the output is mapped to between 0 and 1 for any large value of x.
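A quick sketch of the sigmoid in code:

```python
import math

def sigmoid(x):
    # squashes any real number into the open interval (0, 1)
    return 1 / (1 + math.exp(-x))

print(sigmoid(0))    # 0.5
print(sigmoid(10))   # close to 1
print(sigmoid(-10))  # close to 0
```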

The cost function in the case of logistic regression is given by the product yi * (w · xi), where yi is the true label (+1 or -1) and w · xi is the signed distance of the point from the classifying plane.

It should be maximum for correct classification to happen.

- Suppose passing students are considered as the positive output and failed students as the negative output. Points classified above the classifying plane are taken as positive and those below it as negative; the sign comes from the value of w · xi.
- The cost function is determined by the product of these two, and hence if a point is correctly classified, the cost function will be positive, and if it is incorrectly classified, the cost function will become negative.
- So we have to make sure that we choose the weight values such that the points are correctly classified.
- Hence, by looking at the sign of the cost function, we can say whether a particular entry is correctly classified or not.

Hope you found this post useful. 😊😊

You can find me on Twitter


NOTE: These are just for educational purposes. Never use these in real-life applications.

Bogosort first checks whether the list is already sorted, by verifying that every adjacent pair is correctly ordered. If the list is not in order, it shuffles the whole list randomly and starts over. This process goes on till the list is sorted.

When we get into the time complexity of this algorithm, we get to know why it is an inefficient algorithm.

The logic of Bogosort is as follows:

```
from random import shuffle

def sequence_sorted(data):
    # all() is needed here; a bare generator expression is always truthy
    return all(data[i] <= data[i + 1] for i in range(len(data) - 1))

def bogosort(data):
    while not sequence_sorted(data):
        shuffle(data)
    return data
```

The best-case time complexity is achieved when the list or sequence is already sorted. Best case complexity is **O(n)**.

The average case complexity is **O(n*n!)=O((n+1)!)**.

The worst case is when the list takes forever to get sorted. We have no guarantee that this algorithm will ever return a sorted output, as it randomly shuffles all the entries each time.
The worst-case time complexity is therefore unbounded: **O(∞)**.

We also have an algorithm called Bogobogosort, which checks the first 2 elements of the list and bogosorts them. In the next round, it checks the first 3 elements and bogosorts them and this process continues till it reaches the end. If it finds that the list is not in order, it starts the process again by bogosorting the first 2 elements.

The average time complexity of this algorithm is **O(N! · 1! · 2! · 3! · … · N!)**.

**Hope you learnt some new concepts. Thank you for reading. 😊**

In this post, we will be discussing padding in Convolutional Neural Networks. Padding is the number of pixels that are added to an input image. Padding allows more space for the filter to cover the image and it also helps in improving the accuracy of image analysis.

Broadly classified, there are two types of padding. They are valid padding and same padding.

Valid padding implies no padding at all; the input image is fed into the filter as it is. So if we consider an input of order (**n**), a filter of order (**f**) and take stride = 1, we get an output image of order (**n-f+1**).
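The shrinking output size can be computed directly from the formula (a small sketch; the helper name is my own):

```python
def valid_output_size(n, f, stride=1):
    # output order for valid padding: (n - f) / stride + 1
    return (n - f) // stride + 1

# A 6x6 input through a 3x3 filter shrinks to 4x4
print(valid_output_size(6, 3))  # 4
# Passing that output through another 3x3 filter shrinks it to 2x2
print(valid_output_size(4, 3))  # 2
```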

We can notice here that the order of the output image decreases. Hence we can clearly state that some information is lost as we traverse from input to output. The example provided above is only for one convolutional layer, but deep neural networks have more than one convolutional layer. Hence the obtained output image, when passed through the filters in further layers, will shrink further in size.

In the case of the same padding, we add padding layers say 'p' to the input image in such a way that the output has the same number of pixels as the input. So in simple terms, we are adding pixels to the input, to get the same number of pixels at the output as the original input.

So if the padding value 'p' is 0, no pixels are added to the input. If the padding value equals 1, a pixel border of 1 unit will be added around the input image, and so on for higher padding values.

So if we consider an input of order (**n**), a filter of order (**f**), padding of 'p' layers and take stride = 1, we get an output image of order (**n+2p-f+1**). We can fill the padding layer either with zeros or with the adjacent entries. The more commonly used method is to fill the padding layer with zeros.
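A small sketch of zero-padding and the padding value needed for same padding (the helper names are my own, and the formula assumes stride 1 and an odd filter size):

```python
def same_padding(n, f):
    # solve n + 2p - f + 1 = n for p (stride = 1, odd f)
    return (f - 1) // 2

def zero_pad(image, p):
    # surround a 2D list with a border of p zeros
    width = len(image[0]) + 2 * p
    padded = [[0] * width for _ in range(p)]
    for row in image:
        padded.append([0] * p + row + [0] * p)
    padded += [[0] * width for _ in range(p)]
    return padded

img = [[1, 2], [3, 4]]
print(same_padding(6, 3))  # p = 1 keeps a 6x6 input at 6x6 with a 3x3 filter
print(zero_pad(img, 1))
```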

**Hope you understood the concept. Thank you for reading.**

A grayscale image is represented using only shades of grey. The intensity of each pixel is denoted using values from 0 to 255, i.e., from black to white in terms of an 8-bit integer. It uses only one channel.

Coloured images are constructed by combining red, green and blue (RGB) colours in variable proportions; these three colours are hence called the primary colours. Each colour image pixel contains three channels: the R channel, the G channel and the B channel, each having its own intensity values ranging from 0 to 255.

Convolution is the process of multiplying each pixel with the corresponding pixel value of the filter and then adding all the products to get the result. The combination of these results gives the output image representation.

Now let us look at an example of convolution.

We pass a 6x6 input through a 3x3 filter (here we are using a vertical edge filter). We get a 4x4 output.

Now let us look at how each of the entries in the output is obtained.

We place the filter on top of the input starting from the top left corner till we reach the bottom right corner. Then we perform the process of convolution (multiply the corresponding entries and add them together). The obtained result is the corresponding output entry. Here we take stride value as 1. That is we jump 1 step to the right after each calculation. When we reach the column end, we jump 1 row below. This process goes on till we reach the bottom right corner.

**The Convolution operation:** The part of the input to be convolved with the filter in each step is highlighted.

**The 1st output entry:**
*1(2)+1(0)+1(-1)+1(1)+1(0)+1(-2)+1(2)+1(0)+1(-1)=2-1+1-2+2-1=1*

**The 2nd output entry:**
*1(2)+1(0)+0+1(1)+1(0)+0+1(2)+1(0)+0=2+0+0+1+0+0+2+0+0=5*

**The 3rd output entry:**
*1(2)+1(1)+1(2)=2+1+2=5*

**The 4th output entry:**
*1(-1)+1(-2)+1(-1)=-1-2-1=-4*

*By performing similar calculations:* **The 5th output entry =** *1*

**The 6th output entry=** *5*

**The 7th output entry=** *5*

**The 8th output entry=** *-4*

**The 9th output entry=** *1*

**The 10th output entry=** *5*

**The 11th output entry=** *5*

**The 12th output entry=** *-4*

**The 13th output entry=** *1*

**The 14th output entry=** *5*

**The 15th output entry=** *5*

**The 16th output entry=** *-4*

The output we obtained here is of the order 4 while we have given the input of order 6. Hence we can say that some information loss occurs here.
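The sliding-window process above can be sketched in plain Python. The exact filter values used in the post come from its figures, so the 3x3 vertical edge filter below is an assumption:

```python
def convolve2d(image, kernel, stride=1):
    # slide the kernel over the image; multiply & sum at each position
    n, f = len(image), len(kernel)
    out_size = (n - f) // stride + 1
    output = []
    for i in range(0, out_size * stride, stride):
        row = []
        for j in range(0, out_size * stride, stride):
            total = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(f) for b in range(f)
            )
            row.append(total)
        output.append(row)
    return output

# 4x4 all-ones input with a 3x3 vertical edge filter -> 2x2 output of zeros,
# since a constant region has no vertical edges
image = [[1] * 4 for _ in range(4)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]
print(convolve2d(image, kernel))
```

Note how a 6x6 input with this 3x3 kernel would give the 4x4 output size discussed above.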

To prevent this loss of information, we use the padding technique, which will be discussed in upcoming posts.

Thank you for reading. If you liked this post please consider sharing.

I have listed out some YouTube channels, websites and books which are beginner-friendly and which give us a strong foundation in the basics.

**Youtube Channels**

**Corey Schafer**

This channel is unarguably one of the best resources for Python. Corey Schafer explains the core concepts of Python which can be easily grasped by the beginners.

👉 Corey Schafer

**Python Basics**

This channel teaches about the various features of Python language in a simple manner.

👉 Python Basics

**Python Tricks**

This channel mainly focuses on the shortcuts and tricks in Python language by Bhavesh Bhatt.

👉 Python Tricks

**freeCodeCamp Youtube Channel**

freeCodeCamp helps a large number of people worldwide get access to programming for free. It has a large community and is one of the best places to start off.

👉 freeCodeCamp.org

**Python Programmer**

Python Programmer channel by Giles includes a tutorial on basic as well as advanced topics of Python. He also shares about best books for Python and its applications.

**Programming with Mosh**

The Programming with Mosh YouTube channel covers tutorials on various tech topics. This channel's Python tutorial is also considered one of the best for beginners.

**Websites**

**Real Python**

This site offers Python tutorials for developers of all levels, in the form of articles, books, courses, videos, podcasts etc.

👉 Python Tutorials - Real Python

**freeCodeCamp**

freeCodeCamp may be familiar to most of you. They have a systematic curriculum. In addition, they have an awesome community, where you can ask your doubts if you are stuck. They also have their Youtube channel which is already mentioned above.

👉 freeCodeCamp.org

**Python documentation**

Never underestimate the knowledge that you can get from documentation. It is among the best resources, I would say. Except for a few languages, documentation is well-written, and Python has some of the best documentation. So if you are starting out or stuck at some point, never hesitate to refer to the Python documentation.

**Books**

**Python for everybody**

This book by Dr Charles R. Severance helps one to understand why programming is important. It also is widely used by beginners to learn Python.

**Think Python**

Think Python by Allen B. Downey is an amazing book for Python beginners. It uses simple language along with code snippets so that the readers can practice along with reading.

👉 Think Python

**Automate the boring stuff with Python**

This book teaches its readers Python in a different way: how Python can be used to automate various small tasks, and various other things, are discussed in this book.

👉 Automate the boring stuff with Python

Hope you learnt something useful in this article. If you have any good resources for beginners in Python, please share in the comments.

Credits: Cover image from Unsplash

In Machine Learning, gradient descent is used to update the coefficients of our model. It also ensures that the predicted output value is close to the expected output.

Take, for example, Linear Regression, where we model the output using a linear equation.

Let us say we have an equation **y=mx+c**.

Here, **m** stands for **slope** and **c** for the **y-intercept**. These two values can be optimized and the error in the cost function (the difference between expected and predicted output) can be reduced using the gradient descent algorithm.

So let us see how weight updation works in gradient descent. Let us consider a graph of cost function vs weight. For improving our model, bringing down the value of cost function is essential. We consider the lowest point in the graph as the winner since the cost function would be minimal at this point.

In the above diagram, we can see that with each iteration, the function tries to bring down the cost value. But that is not the case in real-world datasets. In real-world cases, it moves in a zig-zag manner for most of the datasets. The graph for real-world cases is as shown below.

The weight updation takes place by decrementing the weight in steps proportional to the gradient (derivative) of the cost function. The equation used for weight updation is:

*w = w - α · (∂J/∂w)*

Here w corresponds to the weight and α to the learning rate, which determines by what value the descent is made in each iteration.

For every cross mark shown in the graph, we calculate the slope. According to the slope value, we update the weights in the above equation.
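Here is a minimal sketch of that loop for the y = mx + c example (the toy data and learning rate below are my own choices):

```python
# Minimal gradient descent for y = m*x + c on a tiny toy dataset.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]  # generated by y = 2x + 1

m, c = 0.0, 0.0
lr = 0.05  # learning rate (alpha)

for _ in range(2000):
    # gradients of the mean squared error with respect to m and c
    grad_m = sum(2 * (m * x + c - y) * x for x, y in zip(xs, ys)) / len(xs)
    grad_c = sum(2 * (m * x + c - y) for x, y in zip(xs, ys)) / len(xs)
    # weight updation: step opposite to the gradient
    m -= lr * grad_m
    c -= lr * grad_c

print(round(m, 2), round(c, 2))  # close to 2 and 1
```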

Here is a sequence diagram to brief out the process of updating the weights. This update in weights leads to a reduction in the cost function.

Cover generated by an awesome tool called CoverView, built by Rutik Wankhade.

Hope you learnt something from this article. Thank you for taking the time to read this article.

Let us first look at the *confusion matrix*.

The *Accuracy* is calculated as:

*Accuracy = (TP + TN) / (TP + TN + FP + FN)*

The *False Positive Rate* is calculated as:

*FPR = FP / (FP + TN)*

Here;

TP: True Positive
TN: True Negative
FP: False Positive
FN: False Negative
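With some hypothetical counts, the two formulas above can be checked quickly:

```python
# Hypothetical confusion-matrix counts for a 100-record dataset
TP, TN, FP, FN = 40, 45, 5, 10

accuracy = (TP + TN) / (TP + TN + FP + FN)
false_positive_rate = FP / (FP + TN)

print(accuracy)             # 0.85
print(false_positive_rate)  # 0.1
```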

Consider a dataset with 1000 records of people who smoke and who do not smoke, which has the output as a binary classification. i.e., yes or no.

If the record has around 500 yes and 500 no, or 600 yes and 400 no, we can consider it a balanced dataset. But suppose we have around 900 people who are smokers (yes) and only 100 who are non-smokers (no). That is a completely imbalanced dataset, and a model can score high accuracy simply by predicting the majority class, so accuracy won't reflect the actual performance.

Due to this imbalanced dataset, the ML algorithms will become biased. In this case, we go for the usage of recall, precision and F-Beta to predict the accuracy.

Let us look at what precision, recall and the F-Beta score are.

Precision refers to the percentage of results which are relevant. In simpler words, it tells us, "out of the outputs which were predicted as positive, how many were actually positive."

A real case scenario where precision can be used is when a patient who does not have a disease is predicted as an infected person (i.e., False Positive value). Hence, in this case, our aim should be to minimize the value of FP. So, we use precision when False Positive value is important for our analysis.

Recall refers to the percentage of total relevant results which were correctly classified by our algorithm. That is, out of the total positive values, how many were predicted as positive.

An example of this is when an infected person's test result is negative, which can lead to harmful consequences in a real-world scenario. Thus, here the False Negative value holds higher prominence and we should try to decrease this value. So we use recall when we have FN value as important.

Given below is the formula to calculate the F-Beta score:

*F-Beta = (1 + β²) · (Precision · Recall) / (β² · Precision + Recall)*

Here the value of β is selected based on whether the False Positive or the False Negative value plays the major role in our data. If both are equally important, we take β as 1.

If the False Positive value has greater importance, β is selected between 0 and 1, and if the False Negative value has greater importance, β ranges from 1 to 10.

When β = 1, the F-Beta score becomes the balanced harmonic mean of precision and recall, and hence we generally hear people refer to it as the F1-score instead of the F-Beta score.
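A small sketch of the formula in code (the precision and recall values below are made up; scikit-learn offers a ready-made `fbeta_score`):

```python
def f_beta(precision, recall, beta=1.0):
    # weighted harmonic mean of precision and recall
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

p, r = 0.75, 0.6
print(f_beta(p, r))            # beta = 1: the usual F1-score
print(f_beta(p, r, beta=0.5))  # weights precision more heavily
print(f_beta(p, r, beta=2))    # weights recall more heavily
```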

People who work on Data Science problems usually perform feature engineering and Exploratory Data Analysis. This helps us study how the data is spread and how it can be processed further. Data scientists generally do not want to spend much time on front-end development and on building the user interface; their main focus is to make the app functional and constantly develop it.

Streamlit has come as a boon to all the people who work in this field. It is an open-source Python library which makes building simple but extremely satisfying web apps easy.

Streamlit helps us visualise the model and change the code side by side. It has some cool and exciting features like sliders, buttons and many more that you should definitely check out.

**Installation:** *pip install streamlit*

**Running your script:** *streamlit run [filename]*

Streamlit is compatible with almost all the major libraries and frameworks.

It helps us in building a simple API with fewer lines of code.

Adding a widget onto the web app is also a very easy task using streamlit.

Deploying is made pretty easy: we can host the app ourselves, or we could make use of Streamlit for Teams.

*Learn more through the documentation and video links given below.*

GitHub link: https://github.com/streamlit/streamlit

Documentation: https://www.streamlit.io/

YouTube channel: https://www.youtube.com/channel/UC3LD42rjj-Owtxsa6PwGU5Q/

Recently I have come across many blogs which showcased the upsides of Julia and claimed that Julia may become the preferred language for data scientists in the coming future. They have highlighted the strengths of the Julia language. So here are some strong reasons why Python will remain the ideal language for data scientists for at least the next 5-10 years.

**The availability of a vast amount of libraries**

We can find almost any library in Python, which makes it unique among languages. Be it libraries for web development, machine learning, deep learning or even game development, Python has a ready library out there for our use, which makes coding easy.

**The rich developer community**

Python is blessed with a vibrant community, and hence if you opt for Python for any project, it is smooth sailing. For any errors that we encounter, there is some site out there to help us out.

**Easy to use and fast to develop**

The Python programming language mimics English, and therefore we humans find it easy to code in. This helps us build, debug and develop any project in less time.

**Doing more with less code**

This is another plus point Python offers. The same logic that may take around 10 lines in another language can often be shortened to 1-2 lines in Python because of the large number of inbuilt libraries.

**Dynamically typed and portable**

Python is a dynamically typed language; we do not have to explicitly declare the data type of a variable. It is taken care of automatically during execution. Python code is also of the write once, run anywhere type: the code we write is system independent.

The Julia language has some of the above-mentioned traits and has faster execution than Python. But it has many drawbacks which are yet to be overcome. Julia is yet to build a strong community like Python's, and it may come up with exciting features in the coming years. Working on its various weaknesses will surely make Julia a competitor to Python, but that will require at least 5-10 years.

GPT-3, or Generative Pre-trained Transformer 3, is a neural language model from OpenAI. OpenAI launched its beta service last month, which has helped people understand the various amazing things GPT-3 is capable of performing. GPT-3 has been trained on much of the web's content, including coding tutorials, literary works etc. The model has around 175 billion parameters, compared to its predecessor GPT-2, which had only 1.5 billion parameters.

GPT-3 requires only a few-shot demonstration via textual information, as it is already trained with a huge amount of data.

It can develop the suitable widgets for our website using the text input we give.

It is capable of writing stories, code, poems, etc.

Furthermore, it has the ability to write any piece of article in a particular personality's style. For example: write a news report in the style of J. K. Rowling.

It is generally difficult to distinguish between human written and GPT-3 generated articles. GPT-3 has been quite impressive in the beta test run and has found its place in various fields. OpenAI is planning to commercialize this product later this year. Similar to any other product, GPT-3 also has some drawbacks like;

=> It replies with wrong answers to invalid questions.

=> GPT-3 lacks the power of reasoning and common sense. This is clearly illustrated in Kevin Lacker's blog.

The blog link: Kevin Lacker's blog

It is a human tendency to exaggerate the negative effects of technology. But it also turns into a benefit in the long run as we can equip ourselves to overcome the negative sides.

Here, the loss of jobs in many fields like web development, the writing industry, news reporting etc. has been a concern due to the enhancements in AI. It is a bitter truth that some people will surely be affected by this new technology. But the point to note is that those with skills need not worry, as they can survive by constantly evolving and adapting themselves to new technologies.

The thing to remember is, "change is constant."

There is no doubt that GPT-3 is a huge technical achievement in the field of AI. But more accurate and interesting advancements are yet to come.
