The 2021 Budget has been presented, and you may not be someone who likes to spend hours watching the full speech or reading the 60-page PDF released by the government to understand the whole thing. You may also be interested only in a few specific areas and want to search for particular keywords. Well, we can take some help from Python and its NLP libraries. I am doing only some very basic things below, but there is no limit to what you can do with the budget text released to you. …
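As a rough idea of what a keyword search over the budget text could look like, here is a minimal sketch using only the standard library. The sample text, the keyword list and the function names are placeholders made up for this illustration; they are not taken from the actual budget document or from the article's code.

```python
import re
from collections import Counter

# Placeholder text, NOT taken from the actual budget document.
budget_text = (
    "Health spending will rise. Infrastructure spending will also rise. "
    "Health and education remain priorities."
)
# Hypothetical keywords of interest.
keywords = ["health", "infrastructure", "education", "defence"]


def keyword_counts(text, words):
    """Count how often each keyword appears as a whole word (case-insensitive)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    return {word: counts[word] for word in words}


def sentences_with(text, word):
    """Return the sentences that mention the given word."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    word_pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    return [s for s in sentences if word_pattern.search(s)]


print(keyword_counts(budget_text, keywords))
print(sentences_with(budget_text, "health"))
```

With the real document you would first extract the text from the PDF and pass it in place of `budget_text`; an NLP library could then replace the simple regex tokenizer.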
I recently came across data published by the Government of India on farm produce for different crops in different seasons, district by district, from 1996 to 2015. I wanted to use this data to answer some questions I was interested in; one was the per-unit-area production of a particular crop across different states. I chose wheat for my analysis, but you can use any other crop in the given data. From the given data I am not sure whether area is in acres or hectares and produce is in quintals or tons, but that doesn’t affect our…
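The per-unit-area calculation itself is simple: total production divided by total area, grouped by state. The sketch below shows it with plain Python. The rows, the column order and the function name are made up for illustration and do not come from the government dataset.

```python
from collections import defaultdict

# Made-up rows for illustration: (state, crop, area, production).
# The real dataset has its own column names and units.
rows = [
    ("State A", "Wheat", 100.0, 300.0),
    ("State A", "Wheat", 50.0, 150.0),
    ("State B", "Wheat", 200.0, 400.0),
    ("State B", "Rice", 80.0, 160.0),
]


def yield_by_state(records, crop):
    """Return {state: total production / total area} for one crop."""
    area = defaultdict(float)
    production = defaultdict(float)
    for state, row_crop, row_area, row_production in records:
        # Skip other crops and rows with no recorded area.
        if row_crop == crop and row_area > 0:
            area[state] += row_area
            production[state] += row_production
    return {state: production[state] / area[state] for state in area}


wheat_yield = yield_by_state(rows, "Wheat")
print(wheat_yield)
```

Summing area and production before dividing weights each district by its size, which is usually what you want when comparing states; averaging per-district ratios would give small districts the same weight as large ones.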
A Farmer’s Problem that can be solved by matrix algebra:
Suppose you are a farmer in rural India and you want to sell your rice and potato produce. One option is to go to mandis in more than one city to sell all your produce. The other option is to sell all your produce in the same city. The buyer in the same city sets just one condition: the farmer can set the price, but he’ll buy the same amount of rice and potatoes. Buyers in the other markets set the condition that they’ll buy the produce at the same price…
In this article I am trying to cluster names extracted from a Wikipedia article. I’ll be using k-means clustering, and the distance between names will be calculated from the word embedding vectors provided by spaCy. In an earlier article we extracted names from a wiki page, using spaCy’s named entity recognizer to identify the names on that page.
In this article we’ll go a step further and apply an unsupervised machine learning technique, k-means clustering, to cluster the names into different groups, and then analyse whether the groups make any sense.
A word embedding is a vector…
The question of interest here is which dynasty ruled in India for the longest period of time.
Here I am trying to extract all the dynasties from the following page, determine the length of their rule, and plot a graph to find the longest-ruling dynasty: https://en.wikipedia.org/wiki/List_of_Indian_monarchs
We’ll get the data from the wiki, apply some Python techniques and turn it into a visualization. Along the way we’ll apply and understand some Python concepts.
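import pandas as pd
import wikipedia  # third-party package, installed from PyPI

url = r'https://en.wikipedia.org/wiki/List_of_Indian_monarchs'
# Fetch the page by title and keep its plain-text content
page = wikipedia.page('List_of_Indian_monarchs')
content = page.content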
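Once the text is in hand, the remaining work is to pull a start and end year out of each dynasty entry and compare the differences. Here is a minimal sketch of that step. The sample lines and the "Name (start-end CE)" layout are assumptions made for illustration only; the actual page content is formatted differently and would need its own parsing rules, and the dynasty names below are placeholders.

```python
import re

# Hypothetical lines in an assumed "Name (start-end CE)" layout.
sample_lines = [
    "Dynasty A (100-250 CE)",
    "Dynasty B (300-820 CE)",
    "Dynasty C (900-1010 CE)",
]

# A name followed by "(start-end CE)"; accepts a hyphen or an en dash.
pattern = re.compile(
    r"^(?P<name>.+?)\s*\((?P<start>\d+)\s*[-–]\s*(?P<end>\d+)\s*CE\)$"
)


def rule_lengths(lines):
    """Return {dynasty name: years ruled} for every line that matches."""
    lengths = {}
    for line in lines:
        match = pattern.match(line.strip())
        if match:
            lengths[match["name"]] = int(match["end"]) - int(match["start"])
    return lengths


lengths = rule_lengths(sample_lines)
longest = max(lengths, key=lengths.get)
print(longest, lengths[longest])
```

The resulting dictionary can be loaded into a pandas Series and plotted as a bar chart to compare the dynasties visually.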
In this article we’ll try to find the names of people in a Wikipedia article using the Python spaCy library. I assume that you have already installed the spacy and wikipedia libraries from PyPI if you are planning to run the source code from this article.
Often articles are too long and we are only interested in certain information: either a summary, or the major events and major characters associated with it. Here we are just trying to find person names in different articles. Determining whether a word is the name of a person is done using pretrained…
If you were born between the early 1940s and the late 1980s in the Hindi heartland of India, or had any other association with Bollywood in that period, chances are that your childhood is dominated by memories of Bollywood songs played on radio stations like Vividh Bharti. I had one such childhood. Recently I came across some recommendations on my YouTube account for songs from the popular Binaca Geetamala of that period. It made me a little curious to think about those days.
Now I knew pretty well who my favorite singers were and who were most popular in those days but I…
Pickling in Python means persisting an object’s state to secondary storage so that it is still available after the program terminates, and we can recreate the object’s state from that file instead of rerunning all the previous steps required to create it.
This seems to be a very powerful option, but it gets troublesome sometimes. You need to be careful about what you can and cannot pickle.
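To make the "can and cannot" point concrete, here is a small sketch that simply tries to pickle a few objects and reports whether it worked. The helper name `can_pickle` is made up for this illustration. Plain data such as lists of strings pickles fine, while lambdas and thread locks are two common examples of objects the standard `pickle` module refuses.

```python
import pickle
import threading


def can_pickle(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        # Different unpicklable objects fail with different exception types.
        return False
    return True


print(can_pickle(["aa", "bb"]))       # plain data
print(can_pickle(lambda x: x + 1))    # lambdas have no importable name
print(can_pickle(threading.Lock()))   # OS-level resources cannot be saved
```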
Here is a canonical pickle and unpickle structure. We are trying to pickle and unpickle a simple variable called t which stores a list of strings.
t = ["aa"]…
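For reference, a complete round trip might look like the sketch below. The list contents and the file name are chosen for this illustration and are not the article's original values; the file is written to a temporary directory so the example leaves nothing behind in your working folder.

```python
import os
import pickle
import tempfile

t = ["aa", "bb"]  # sample values chosen for this sketch

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "t.pkl")

# Pickle: write the object's state to disk (note the binary mode).
with open(path, "wb") as f:
    pickle.dump(t, f)

# Unpickle: rebuild an equal object from the file.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored == t)   # True: same contents
print(restored is t)   # False: a new object was created
```

Unpickling can execute arbitrary code, so only load pickle files that come from a source you trust.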
An object is immutable if you cannot change its state once it is created.
An object has 3 properties:
I got the following image from https://slideplayer.com/slide/5854201/, which depicts this idea very well.
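A short experiment shows the difference in practice. Python's data model describes every object as having an identity, a type and a value; for a mutable object such as a list the value can change while the identity stays the same, whereas an immutable object such as a tuple rejects the change. The variable names below are arbitrary and chosen for this sketch.

```python
# Mutable: a list keeps its identity while its value changes.
numbers = [1, 2, 3]
same_list = numbers            # second name for the same object
numbers.append(4)
print(same_list is numbers, same_list)   # True [1, 2, 3, 4]

# Immutable: a tuple refuses in-place modification.
point = (1, 2)
try:
    point[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False
print(tuple_changed)                     # False

# "Changing" a tuple really builds a new object; the old one is untouched.
extended = point + (3,)
print(extended is point, point)          # False (1, 2)
```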
I always wonder what to advise when I see a question like ‘Should I learn Python or R’, ‘Java or Ruby’, ‘Lisp or Prolog’, ‘Ada or COBOL’. (I never actually saw the last one.) In my opinion the answer is always obvious: both. Probably my approach to learning a new language is a little different from others’.
Learning any new thing requires time, patience, practice and motivation. One of the reasons we face a dilemma like the one above is the scarcity of some of these essential elements needed for learning, and mostly it is time. …
Data Scientist / Data Engineer