Python: A Beginner’s Guide

Python is one of the most popular programming languages in use today. It was created by Guido van Rossum and first released in 1991. Python can be used for many different purposes, including building web applications, reading and modifying files, connecting to database systems, and handling big data and data science in general. Like many programming languages, Python can also serve as a fancy calculator; it performs mathematical operations with ease.

>>> 12 + 15 # Addition
27
>>> 35 - 17 # Subtraction
18
>>> 5 * 5 # Multiplication
25
>>> 25 / 5 # Division
5.0
>>> 5 ** 2 # Exponentiation (does NOT use ^ like some calculators)
25
>>> 10 % 3 # Modular division (or "mod" for short; returns a remainder)
1
>>> 10 // 3 # Floor division (or "round down" division)
3
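Two details of these operators are easy to trip over, so here is a minimal sketch of them. The variable names are only illustrative and do not appear elsewhere in this post. It shows that / always returns a float, and that ^ is a valid operator in Python but means bitwise XOR rather than exponentiation.

```python
# Illustrative names only; they are not used elsewhere in this post.
true_division = 25 / 5      # "/" always returns a float, even for a whole result
floor_division = 10 // 3    # "//" rounds the result down to a whole number
remainder = 10 % 3          # "%" returns what is left over after floor division
power = 5 ** 2              # "**" is exponentiation
xor_result = 5 ^ 2          # "^" is bitwise XOR, not exponentiation

print(true_division)   # 5.0
print(floor_division)  # 3
print(remainder)       # 1
print(power)           # 25
print(xor_result)      # 7
```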

Let’s take a look at some of the basic functionality within Python, and hopefully by the end of this blog you’ll have a better understanding of what Python is, what it can do, and just how useful it is to data scientists and programmers everywhere.

Variables

Before we go any further, let’s define what a variable is. A variable lets us save data so that we can call it back, or reference it, without having to rewrite the entire piece of data. Below are four different variables that we have saved to s1, s2, s3, and s4 (we’ll go into why this is bad naming practice later in the blog, but bear with me here).

s1 = 123
s2 = 678.0
s3 = "This is my first string"
s4 = 'and this is my second string'

As mentioned above, the way I named these variables is bad practice, so let me explain why. In Python, there are certain “rules” that must be followed, as well as some “rules” that are just good Pythonic practice (see PEP 8, the Style Guide for Python Code, for more information).

The “mandatory rules” are: variable names can only consist of letters, numbers, and underscores (no spaces or other special characters); variable names cannot begin with a number; and finally, you can’t name a variable after a reserved Python keyword (e.g. if or else).
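The mandatory rules can be checked from Python itself. The sketch below is one way to do it, using the string method isidentifier() and the standard library keyword module; the helper function and the candidate names are hypothetical and exist only for illustration.

```python
import keyword

# Hypothetical candidate names, chosen only to illustrate the rules.
candidates = ["total_sales", "sales2", "2sales", "total-sales", "if"]


def is_valid_variable_name(name):
    """Return True if `name` could be used as a variable name."""
    # isidentifier() checks the character rules; iskeyword() rejects reserved words.
    return name.isidentifier() and not keyword.iskeyword(name)


for name in candidates:
    print(name, is_valid_variable_name(name))
```

The first two names pass; the other three break a rule (a leading number, a special character, and a reserved keyword, respectively).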

Some of the “good Pythonic practice” rules are: variable names should always be descriptive (a rule I broke above; don’t name them x or df); avoid capital letters (by convention, capitalized names are used for classes); variables should not begin with an underscore (this means something special that we won’t cover in this blog post); multi-word variables should be in snake_case (all lowercase, separated by underscores) rather than camelCase (new words beginning with capitals); finally, while you CAN name variables after built-in Python functions (like print), it’s a very bad idea to do so because the new variable hides the original function.
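To make the style rules concrete, here is a small sketch contrasting vague names with descriptive snake_case ones, followed by a demonstration of why reusing a built-in name is risky. All names and values are hypothetical, and the shadowing is kept inside a function so that it cannot affect the rest of a program.

```python
# Hard to read: the names say nothing about what they hold.
x = 19.99
y = 3

# Easier to read: descriptive, lowercase, snake_case names (hypothetical examples).
unit_price = 19.99
items_in_cart = 3
order_total = unit_price * items_in_cart


def shadowing_demo():
    """Show what goes wrong when a name hides a built-in function."""
    len = 5  # inside this function, "len" now refers to the number 5
    try:
        len("abc")  # the built-in len() is no longer reachable by this name
    except TypeError as error:
        return str(error)


print(order_total)
print(shadowing_demo())
```

The call inside shadowing_demo() fails with a TypeError because the name len now points at an integer, which is not callable.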

In addition to performing basic mathematical operations as above, we can also perform them using our saved variables. For example, adding s1 + s2 will yield 801.0. We cannot add s3 to either s1 or s2 because one is a string and the other is a number. Attempting to do so will cause Python to raise a TypeError with a brief explanation of why this isn’t possible. That said, you CAN add s3 to s4, as they’re both what we call strings; the result is a single string with the two joined end to end.
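The behaviour described above can be sketched as follows, reusing the four variables from earlier. The names number_sum, combined, and error_message are introduced here only for illustration.

```python
s1 = 123
s2 = 678.0
s3 = "This is my first string"
s4 = 'and this is my second string'

number_sum = s1 + s2   # int + float gives a float
combined = s3 + s4     # str + str joins the two strings, with no space added

try:
    s3 + s1            # str + int is not allowed
except TypeError as error:
    error_message = str(error)

print(number_sum)      # 801.0
print(combined)
print(error_message)
```

Note that adding strings does not insert a space between them, so the combined text reads "...first stringand this...".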


Python Data Types

Now that we understand variables a little better, let’s look at the basic data types within Python. Before we look at those, we need to understand what “data” actually means. Most people think of data as spreadsheets of information — and they aren’t too far off. Data is really just information. Everything that is run and stored in Python is considered data.

Let’s return to the examples we used above when learning about variables (repeated below as a reminder). In these examples, we saw three of the most common data types we’ll find within Python.

s1 = 123
s2 = 678.0
s3 = "This is my first string"
s4 = 'and this is my second string'

In the first variable above, s1, we see what is called an integer — often referred to in Python as an int type. An int is a number with no decimal part — such as all of the numbers used in our mathematical example at the top.

In the second variable, s2, we see what is called a float — referred to in Python as a float type. A float is a number WITH a decimal part — even if that decimal part is a 0 like in our example above.

Finally, our third and fourth variables, s3 and s4, are strings, referred to in Python as the str type. A string is how we store text data in Python. Strings are just sequences of characters between a pair of single quotes (' ') or double quotes (" "). Python is ultimately indifferent to which we use, as long as a string opens and closes with the same kind of quote.
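As a quick illustration of these three types, here is a short sketch. The variable names and values are hypothetical; the point is that the quote style does not change a string's value, and that an int and a float can be equal in value while still being different types.

```python
whole_number = 123          # int: no decimal part
decimal_number = 678.0      # float: has a decimal part, even if it is 0
double_quoted = "hello"     # str created with double quotes
single_quoted = 'hello'     # str created with single quotes

print(double_quoted == single_quoted)   # True: the quote style does not change the value
print(whole_number == 123.0)            # True: equal in value, though the types differ

# Using one quote style on the outside lets the other appear inside the text.
sentence = "It's a string"
print(sentence)
```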

Container / Collection Types

In many cases, we will want to store many values in a single object, known as a container or a collection. A container can hold an arbitrary number of other objects. There are a number of common containers within Python, so let’s dive right in.

The first, and likely the most common, is called a list. A list is an ordered, mutable, heterogeneous collection of objects. To clarify: ordered means that the objects keep a specific order; mutable means that the list can be mutated, or changed; and heterogeneous means that you can mix and match any type of object, or data type, within a list (int, float, string, or anything else). Lists are contained within a set of square brackets []. Instantiating a new list can be done in a few different ways, as seen below.

>>> new_list = [] # instantiates an empty list
>>> new_list = list() # also instantiates an empty list
>>> new_list
[]
>>> new_list = ['obj1', 'obj2', 'obj3'] # instantiates a list of strings
>>> new_list
['obj1', 'obj2', 'obj3']
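The three properties of a list (ordered, mutable, heterogeneous) can be sketched in a few lines. The list contents below are hypothetical and chosen only to show each property once.

```python
mixed_list = [123, 678.0, "a string"]   # heterogeneous: int, float, and str together

first_item = mixed_list[0]     # ordered: items keep their positions, counted from 0
mixed_list[1] = 999.5          # mutable: an existing item can be replaced
mixed_list.append("new item")  # mutable: new items can be added to the end

print(first_item)   # 123
print(mixed_list)   # [123, 999.5, 'a string', 'new item']
```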

Less common than the list is the tuple. Tuples are similar to lists in that they are ordered and heterogeneous; however, they differ because they are immutable, meaning that once created they cannot be mutated or changed. Tuples are contained within a set of parentheses (). Tuples are also slightly smaller and faster to create than lists, so they still have their uses. Instantiating a new tuple can also be done in a few different ways.

>>> new_tuple = () # instantiates an empty tuple
>>> new_tuple = tuple() # also instantiates an empty tuple
>>> new_tuple
()
>>> new_tuple = ('obj1', 'obj2', 'obj3') # instantiates a tuple of strings
>>> new_tuple
('obj1', 'obj2', 'obj3')
>>> new_tuple = ('obj1',) # tuples containing a single object need a trailing comma so Python knows it is a tuple rather than a grouping operation
>>> new_tuple
('obj1',)
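Here is a brief sketch of what immutability means in practice, along with the trailing-comma rule. The tuple named point and its values are hypothetical.

```python
point = (3, 4.5, "label")   # hypothetical tuple mixing int, float, and str

try:
    point[0] = 10           # tuples do not support item assignment
except TypeError as error:
    error_message = str(error)

not_a_tuple = ('obj1')      # parentheses alone only group: this is a str
single_tuple = ('obj1',)    # the trailing comma makes it a tuple

print(point[0])             # 3: reading by position still works
print(error_message)
print(type(not_a_tuple).__name__, type(single_tuple).__name__)   # str tuple
```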

We also have a set. Sets are unordered, mutable collections of unique objects (similar to traditional sets in mathematics). A set can be created by calling the set() function with no arguments, which creates an empty set; by listing our elements inside a set of curly brackets {} (note that empty curly brackets create a dictionary, not a set); or by passing set() a single iterable argument, in which case Python will return a new set containing one element for each unique element within our iterable.

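>>> new_set = set() # instantiates an empty set
>>> new_set
set()
>>> new_set = {'A', 'A', 'B', 'C', 'C', 'D'} # duplicates are dropped
>>> new_set
{'A', 'B', 'C', 'D'}
>>> new_set = set(['A', 'A', 'B', 'C', 'C', 'D']) # the order shown may vary, since sets are unordered
>>> new_set
{'A', 'B', 'C', 'D'}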
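A few set behaviours are worth seeing side by side: uniqueness, mutability, the empty-braces pitfall, and the mathematical operations sets support. This is a minimal sketch with hypothetical names and values.

```python
letters = set("banana")   # built from an iterable, keeping each unique character once
print(letters == {"b", "a", "n"})   # True

letters.add("z")          # mutable: elements can be added
print("z" in letters)     # True

empty_braces = {}
print(type(empty_braces).__name__)  # dict: empty braces create a dictionary, not a set

evens = {2, 4, 6}
small = {1, 2, 3, 4}
union = evens | small          # every element found in either set
intersection = evens & small   # only the elements found in both sets
print(union == {1, 2, 3, 4, 6})   # True
print(intersection == {2, 4})     # True
```

The comparisons use == rather than printing the sets directly because the order in which a set displays its elements is not guaranteed.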

Finally, the last container we’ll discuss is a dictionary. Dictionaries are one of the most common container types that you’ll encounter in your Pythonic journey. A dictionary is a mutable collection of key-value pairs in which each key is unique; since Python 3.7, dictionaries also keep their items in insertion order. They can be thought of much like an actual dictionary, where the key is the “word” and the value is the “definition” of that specific word. Dictionaries, much like the other container types, can be instantiated in a number of different ways.

>>> new_dict = {} # instantiates an empty dictionary
>>> new_dict = dict() # also instantiates an empty dictionary
>>> new_dict
{}
>>> new_dict = {'a': 1, 'b': 2, 'c': 3}
>>> new_dict
{'a': 1, 'b': 2, 'c': 3}
>>> new_dict = dict(a = 1, b = 2, c = 3)
>>> new_dict
{'a': 1, 'b': 2, 'c': 3}
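Once a dictionary exists, the most common things to do with it are looking up, adding, and replacing values by key. The sketch below follows the word and definition analogy; the glossary contents are hypothetical.

```python
# Hypothetical word/definition pairs, mirroring the dictionary analogy.
glossary = {"int": "a whole number", "float": "a number with a decimal part"}

print(glossary["int"])             # look up a value by its key
glossary["str"] = "text data"      # add a new key-value pair
glossary["int"] = "an integer"     # replace the value of an existing key

# .get() returns a default instead of raising an error when the key is missing.
print(glossary.get("list", "not defined yet"))

for word, definition in glossary.items():
    print(word, "->", definition)
```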

If you’re ever unsure what type of object you’ve just created, you can always check it by calling the type() function on the variable and Python will let you know what you’ve created.

>>> type(new_list)
<class 'list'>
>>> type(new_tuple)
<class 'tuple'>
>>> type(new_set)
<class 'set'>
>>> type(new_dict)
<class 'dict'>

And thinking back to the variables we created earlier...

>>> type(s1)
<class 'int'>
>>> type(s2)
<class 'float'>
>>> type(s3)
<class 'str'>
>>> type(s4)
<class 'str'>
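Inside a script, as opposed to the interactive prompt, it is often handier to get the type's name as text or to ask a yes/no question about it. The sketch below shows one way to do both, using type(...).__name__ and the built-in isinstance(); the list of sample values is hypothetical.

```python
# Hypothetical sample values, one for each type covered in this post.
values = [123, 678.0, "This is my first string", ["obj1"], ("obj1",), {"A"}, {"a": 1}]

type_names = [type(value).__name__ for value in values]
print(type_names)  # ['int', 'float', 'str', 'list', 'tuple', 'set', 'dict']

print(isinstance(678.0, float))         # True
print(isinstance(123, (int, float)))    # True: a tuple of types means "any of these"
```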


With that, we’ve looked at a (very) simple definition of Python, what a variable is and how to instantiate one, some basic data types, and some basic container / collection types within Python. While this isn’t meant to be an all-encompassing crash-course, hopefully it’s taught you enough to venture out into the world of Python, learn more, and apply it to your world. I personally use Python for data science, but as I mentioned above: the uses for Python are almost endless. If you’re interested in learning a little more about how Python is relevant to the world of data science, take a look at my blog on the pandas library within Python and how that can help. Hope you enjoyed the read and I’ll see you next time!
