Python Series Part 8: Data Structures

Jarret B

Previously, I wrote an article on Python Basics. The article covered the different data types, such as strings, floats, etc. In this article I will finish the data types and what you can do with them. I apologize now for the length of this article. There are quite a few Data Structures to cover.

There are a lot of similarities between these data types. I hope this doesn't become monotonous in any way.

Let's start by explaining each type:
  1. List
  2. Tuple
  3. Strings (I've covered this, but not these operations)
  4. Sets
  5. Dictionaries
List Definition

A List is an ordered collection of data that can contain duplicates. The List can be used similarly to an array as used in other languages.

For example, we can have a List contain people's names, ages, etc.

A List can be changed, so it is not read-only.

Tuple Definition

A Tuple is the same as a List, but it is read-only and cannot be changed other than the Tuple being deleted as a whole.

A Tuple generally uses a little less memory than a List and can be slightly faster to create, so it can help with performance in some programs.

Strings Definition

A string is a sequence of characters. The data is read-only and cannot be changed. That means that if I have a string, I cannot change just a single letter or a few letters. I can only replace the string by overwriting the whole string with another.

I have already covered strings, but in this article, I am covering more abilities to manipulate strings.

Sets Definition

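A Set is an unordered collection of unique data. Elements can be added to or removed from a Set, but each element must itself be a value that cannot be changed, such as a number or a string. If you place two items in the Set that are identical, one will be dropped from the Set.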

If we create a Set of everyone's name in a business, we may have more than one identical name. Once we create the Set, the identical names will be reduced to only one entry. Let me give an example. Let's assume we have the following names: Bill, Scott, John, Chris, Mike and John. We'll assume there are two Johns in this workplace. Once the Set is created, the Set will contain: Bill, Scott, John, Chris, Mike.

Dictionary Definition

A Dictionary is a set of data that contains a key with an associated value. The keys must be unique, while the values can be repeated.

Think of an actual Dictionary. Each entry has a key. The key is the word that you are looking up. The value is the definition of the key word. Of course, we can change this to any key and value pair. An example may be the following:

Key   Value
1     One
2     Two
3     Three
4     Four
5     Five

We can look up each number (the key) and get the value to 'translate' the number to a word.

The key and value can be strings or numeric or a combination of both.

A Dictionary can be changed, so it is not read-only.

Creating Data Structures

Creating each Data Type requires a different method. They look similar, but are different in a minor way. For the example, I will add a set of numbers to each type. The numbers will be 1,2,3,4,5. The following code sets up each type and then prints out the type of variable.

Code:
list = [1,2,3,4,5]
tuple = (1,2,3,4,5)
string = "1,2,3,4,5"
set = {1,2,3,4,5}
dictionary = {"1":"One","2":"Two","3":"Three","4":"Four","5":"Five"}

print ("List: ", type(list))
print ("Tuple: ", type(tuple))
print ("String: ", type(string))
print ("Set: ", type(set))
print ("Dictionary: ", type(dictionary))

The output is easy to understand:

Code:
List: <class 'list'>
Tuple: <class 'tuple'>
String: <class 'str'>
Set: <class 'set'>
Dictionary: <class 'dict'>

The creation of a Data structure can be summed up as:
  1. List created with brackets [ ]
  2. Tuple created with parentheses ( )
  3. String created with quotes " " or ' '
  4. Set created with curly braces { }
  5. Dictionary created with curly braces { }, with each key and value separated by a colon and each key/value pair separated by a comma. Keys and values that are strings go in quotes.
Now that we've seen how to create the various Data Structures, we can look into each individually. Please be aware that there will be areas that repeat methods, since they are similar in some ways.
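
Before looking at each type, two edge cases in the creation rules are worth a quick sketch. This is only an illustration, and the variable names ('not_a_tuple', 'single', 'number_words') are made up for the example. It shows that a one-element Tuple needs a trailing comma, and that Dictionary keys only need quotes when they are strings.

```python
# Parentheses alone do not make a Tuple; the comma does.
not_a_tuple = (1)      # just the integer 1
single = (1,)          # a Tuple with one element

# Dictionary keys need quotes only when they are strings.
number_words = {1: "One", 2: "Two"}

print(type(not_a_tuple))   # <class 'int'>
print(type(single))        # <class 'tuple'>
print(number_words[1])     # One
```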

List Methods and Abilities

Not all of these Data Structures can be indexed to allow picking out one or more items or elements in the sequence. We can index the List. When you index a sequence, the first item in the sequence is index 0, not index 1.

Let's look at the List:

Code:
list = [1,2,3,4,5]

Here, there are 5 items in the List, but the indexes run from 0 to 4. We can also use a negative index to count from the end, so the last item is index -1. The second-to-last item is index 3 or -2.
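
As a small sketch of positive and negative indexing, the lines below read the same List from both ends. The name 'numbers' is just a stand-in I picked so the example does not reuse the built-in name 'list'.

```python
numbers = [1, 2, 3, 4, 5]

print(numbers[0])    # first item: 1
print(numbers[-1])   # last item: 5
print(numbers[3])    # second-to-last item by positive index: 4
print(numbers[-2])   # the same item by negative index: 4
```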

We can print the whole List, including the brackets with the command:

Code:
print (list)

If you only want a specific element, you can specify the index. For example, if we want the item at index 3, which is '4', the command is:

Code:
print (list[3])

We can print all items in a list using a 'for' statement:

Code:
for element in list:
    print (element)

The output will place each element on a separate line. We can place them all on one line using the 'end' parameter in the 'print' command:

Code:
for element in list:
    print (element, end=", ")

But what if you want just a portion of the elements? This is called Slicing. If we want the elements at index 1 through index 3, the command is:

Code:
print (list[1:4])

You may look at the code and think, "why did he put 1 to 4?". The first index is where to start and is included in the selection. The second number is where to quit, but it is not included in the selection, so it won't be in the output.

So, if the last entry is not included, then how can we select the last element? To start at the first element or run through the last element, you can leave that side of the colon blank. I'll give two lines of code. The first selects the items from index 0 to 3, and the second selects from index 3 to the end:

Code:
print (list[:4])
print (list[3:])

You can also include all elements by not specifying either a beginning or end, just a colon:

Code:
print (list[:])
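
To see the slices side by side, here is a short sketch that stores each one in its own variable. The variable names are only illustrative. Note that a slice always produces a new List, so 'everything' is a copy and not the original.

```python
numbers = [1, 2, 3, 4, 5]

middle = numbers[1:4]      # indexes 1, 2 and 3
first_four = numbers[:4]   # index 0 up to, but not including, 4
from_three = numbers[3:]   # index 3 through the end
everything = numbers[:]    # a copy of the whole List

print(middle)       # [2, 3, 4]
print(first_four)   # [1, 2, 3, 4]
print(from_three)   # [4, 5]
print(everything)   # [1, 2, 3, 4, 5]
```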

Since the List is not read-only, we can add elements to the list. We have two ways to accomplish this:
  1. append - adds a single item to the end of the List
  2. extend - adds every element from another List to the end of the List
With our example, let's add another item to the List:

Code:
list.append(6)

We specify the variable 'list' and use the method named 'append' after a period. We enter the added element in parentheses. With this method, you can only add one element at a time.

To 'extend' the List we can create another List and put the items in it we want to include in the original. So, I'll create a List named 'list1' and place into it items '6, 7'. Then the code will take the items in 'list1' and add them to 'list':

Code:
list1=[6,7]
list.extend(list1)

The List named 'list' stays extended only while the program is running. Once you end the program and restart it, the List is created again with its original items.
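
The difference between 'append' and 'extend' is easiest to see when both are given a List. The sketch below uses made-up names ('numbers', 'extra', 'appended', 'extended') and works on copies so the two results can be compared.

```python
numbers = [1, 2, 3, 4, 5]
extra = [6, 7]

appended = numbers[:]      # work on copies so 'numbers' stays intact
appended.append(extra)     # adds the List 'extra' as ONE element

extended = numbers[:]
extended.extend(extra)     # adds each element of 'extra' separately

print(appended)   # [1, 2, 3, 4, 5, [6, 7]]
print(extended)   # [1, 2, 3, 4, 5, 6, 7]
```

So 'append' is the one to use for a single item, and 'extend' is the one to use when you already have several items collected in another List.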

You can see that we can add an element, so we can also remove an item. Here, we do not specify the index; we specify the value of the element to remove. Say we want to remove '2'. The code is:

Code:
list.remove(2)

If you run that code and then print the List, you'll see the element with the value of '2' is removed. If we remove an element that exists more than once, it only removes the first element where the value is equal to what we specify.
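
Here is a minimal sketch of both cases, using a hypothetical List that contains the value 2 twice. It also shows what happens when the value is not in the List at all: Python raises a 'ValueError', which the example catches so the program keeps running.

```python
numbers = [1, 2, 3, 2, 5]

numbers.remove(2)          # removes only the first 2
print(numbers)             # [1, 3, 2, 5]

# Removing a value that is not in the List raises ValueError.
try:
    numbers.remove(99)
    removed = True
except ValueError:
    removed = False
print(removed)             # False
```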

We can find out how many elements are in the List by using the built-in 'len' function. The command is as follows:

Code:
print (len(list))

If we are using only our initial List of 1 through 5, the output will be 5. Again, this means index 0 to 4.

After adding and removing entries, we can also sort the list. If numeric, the elements will be in numeric order, and if alphabetic, the elements will be in ascending order A to Z. The code is:

Code:
list.sort()

You can sort the elements in reverse order with:

Code:
list.sort(reverse=True)
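
As a quick sketch, the lines below sort one List of numbers and one List of names. Both Lists are invented for the example. Keep in mind that 'sort' changes the List itself and does not return a new one.

```python
numbers = [3, 1, 5, 2, 4]
names = ["Scott", "Bill", "Mike", "Chris", "John"]

numbers.sort()               # numeric order, in place
names.sort()                 # A to Z, in place
print(numbers)               # [1, 2, 3, 4, 5]
print(names)                 # ['Bill', 'Chris', 'John', 'Mike', 'Scott']

numbers.sort(reverse=True)   # largest first
print(numbers)               # [5, 4, 3, 2, 1]
```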

This should give a good start on managing Lists.

Tuple Methods and Abilities

The Tuple is indexed and can be accessed just like a List, even including the negative indexes. You can print the whole Tuple using the same commands as with the List. You can also perform Slicing to get more than one element.

We can find out how many times a value appears in the Tuple with the 'count' method:

Code:
tuple=(1,2,3,4,5)
print (tuple.count(3))

The output would be '1' since the value of '3' only appears once in the Tuple.

To find the index of the first element with the value '3' (which is index 2 in this Tuple), the command is:

Code:
print(tuple.index(3))

Since a Tuple is read-only, we cannot add elements to or remove elements from the Tuple.
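
To pull the Tuple methods together, here is a small sketch using a hypothetical Tuple named 'values' in which 3 appears twice. It also tries to change an element; Python answers with a 'TypeError', which the example catches.

```python
values = (1, 2, 3, 3, 5)

print(values.count(3))   # 2, because 3 appears twice
print(values.index(3))   # 2, the index of the first 3

try:
    values[0] = 10       # not allowed on a Tuple
    changed = True
except TypeError:
    changed = False
print(changed)           # False
```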

Strings Methods and Abilities

These are normal string variables, so do not think we are covering something different.

A string can be indexed, negative indexed and you can use slicing. These methods are just the same as with List and Tuple.

Keep in mind that strings are read-only. So, let's look at an example of trying to change a string.

If I wanted to pick a specific character and change it:

Code:
str = "linux.org"
str[0] = "L"

The code will generate a 'TypeError', since a string does not support item assignment.

I can modify a string to create a new string:

Code:
str = "Jarret Buse "
str1 = "writes for Linux.org."
str = str + str1

This works since we are recreating the string. You can create a new string in the variable, but you cannot modify an existing string by a single character. Notice that the plus sign (+) can join strings.
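
The same idea gives one possible way to get the capital 'L' that the earlier example could not assign directly. This sketch assumes a variable named 'site' (my own name, chosen to avoid reusing 'str') and builds a new string from a new first letter plus a slice of the old string.

```python
site = "linux.org"

# Build a new string from "L" plus everything after the first character.
site = "L" + site[1:]
print(site)   # Linux.org
```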

If you want multiple lines in a string, you use three double quotes:

Code:
str1="""This is Line 1
This is Line 2
This is Line 3"""

When printing out the string, there will be three lines printed.

Since a string can be sliced, I could use the code:

Code:
print (str1[0:14])

This prints just the first line.

If we had two strings, we can compare them using two equal signs (==).

Code:
print (str1 == str2)

If 'str1' and 'str2' were identical, the response is 'True', otherwise it is 'False'.

By identical, I mean identical in case as well. The strings 'Jarret' and 'jarret' are not identical.
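
If the case should not matter, one common approach is to compare lower-case copies of both strings. The sketch below uses the string method 'lower', which this article has not covered, and the names 'name1' and 'name2' are just for illustration.

```python
name1 = "Jarret"
name2 = "jarret"

print(name1 == name2)                  # False, the case differs
print(name1.lower() == name2.lower())  # True, compared in lower case
```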

We can go through each letter with a 'for' statement:

Code:
for character in str1:
    print (character)

Let's say the string was 'Jarret'. We can get the length of any string with:

Code:
length = len(str1)
print (length)

The output would be '6'.

If we wanted to find a character or set of characters in a string, we can use the 'find' method.

Code:
str = "This is a string made up of multiple words."
print (str.find("multiple"))

The response is '28'. If I looked for a string that didn't exist in the string, then the response would be '-1'. By default, the search starts at the beginning of the string and searches until the end of the string. We can change the area of searching by including the starting and ending spot.
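
Here is a short sketch of the starting and ending positions in action. The variable name 'sentence' is my own choice; the second and third arguments to 'find' are the index to start at and the index to stop before.

```python
sentence = "This is a string made up of multiple words."

print(sentence.find("is"))               # 2, inside "This"
print(sentence.find("is", 3))            # 5, searching from index 3 onward
print(sentence.find("multiple", 0, 20))  # -1, not within indexes 0 to 19
```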

Say we want to count every letter 'o' in a string. The code is as follows:

Code:
str = "The quick red fox jumps over the lazy dog."
count = 0
start = 0
place = 0
while place != -1:
    place = str.find("o", start)
    if place == -1:
        break
    count = count + 1
    start = place + 1
print (f"There are {count} o's in '{str}'")

The 'start' value changes to one position past the last place we found the letter 'o'. Every time we find the letter 'o', we set the new start position to 1 more than the place it was found. If we started the search at the place it was found, we would keep finding the same letter over and over.

We could find the length of the string and set that as an ending point, but the search already defaults to the end of the string. If we did, though, the code would be:

Code:
str = "The quick red fox jumps over the lazy dog."
count = 0
start = 0
place = 0
end = len(str)
while place != -1:
    place = str.find("o", start, end)
    if place == -1:
        break
    count = count + 1
    start = place + 1
print (f"There are {count} o's in '{str}'")

When using the 'find' method, the value returned is the 'index', starting at 0.
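
The counting loop can also be wrapped in a function so it works for any string and any letter. This is only a sketch: the function name 'count_letter' and its parameters are hypothetical. For comparison, it also calls the string method 'count', which is not covered in this article but gives the same answer for a single letter.

```python
def count_letter(text, letter):
    """Count occurrences of 'letter' in 'text' using repeated find() calls."""
    count = 0
    start = 0
    while True:
        place = text.find(letter, start)
        if place == -1:          # nothing more to find
            break
        count += 1
        start = place + 1        # continue one position past the match
    return count

sentence = "The quick red fox jumps over the lazy dog."
print(count_letter(sentence, "o"))   # 3
print(sentence.count("o"))           # 3, the string method agrees
```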

Sets

The elements in a Set must be values that cannot be changed, such as numbers, strings or Tuples. You can add or delete elements, and the Set always contains only unique elements.

Creation is done with curly brackets. For example, if we use the names from the Definition area:

Code:
set1 = {"Bill", "Scott", "John", "Chris", "Mike", "John"}
print (set1)

The output is:

Code:
{'Bill', 'Mike', 'Scott', 'John', 'Chris'}

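The duplicate (John) is removed. Notice that the order differs from the order I typed. A Set is unordered, so the names may print in a different order each time the program runs.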

If the elements are small whole numbers, then something different appears to occur:

Code:
set2 = {1,3,5,7,6,4,2}
print (set2)

The output is:

Code:
{1, 2, 3, 4, 5, 6, 7}

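The numeric elements appear sorted, but this is a side effect of how Python stores small whole numbers in a Set. A Set does not guarantee any order, so do not rely on it.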
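
When the order does matter, one option is the built-in 'sorted' function, which returns a sorted List built from the Set. This is a minimal sketch; 'sorted' has not been covered in this series yet, and the Set named 'names' is just the earlier example under a new name.

```python
set2 = {1, 3, 5, 7, 6, 4, 2}
names = {"Bill", "Scott", "John", "Chris", "Mike", "John"}

print(sorted(set2))    # [1, 2, 3, 4, 5, 6, 7]
print(sorted(names))   # ['Bill', 'Chris', 'John', 'Mike', 'Scott']
```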

If you create an empty Set with curly brackets, it will not be a Set:

Code:
set3 = {}
print (type(set3))

The result is that Python sees it as type 'dict' (a Dictionary). So, to create an empty Set, use the code:

Code:
set3 = set()
print (type(set3))

Now we have an empty Set that we can add elements to as we need them. I use the name 'set3' here because a variable named 'set' would hide the built-in 'set()' function, and the second example would fail.

We can add and remove elements. We just use the 'add' method to add an element:

Code:
set3.add(<element value>)

You can only add one value at a time with the 'add' method.

Removing an element uses the 'discard' method.

Code:
set3.discard(<element value>)

Again, you can only remove one element at a time.

You need no indexes, only the value to add to or remove from the Set.

To determine the number of elements in the Set, use the 'len' function.
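
Putting these pieces together, here is a small sketch that starts from an empty Set and adds and removes a few values. The name 'letters' is only an example. It also shows that adding a value twice changes nothing, and that 'discard' does not complain when the value is missing.

```python
letters = set()          # start with an empty Set
letters.add("a")
letters.add("b")
letters.add("a")         # already present, so nothing changes
print(len(letters))      # 2

letters.discard("b")
letters.discard("z")     # not in the Set; 'discard' does not raise an error
print(letters)           # {'a'}
```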

We have three more methods to manipulate sets: Union, Intersection and Difference.

The Union of two Sets will combine the Sets into one Set, and any duplicate elements are kept only once.

Code:
seta = {"j","a","r","r","e","t"}
setb = {"a","e","i","o","u"}
setc = (seta.union(setb))
print (setc)

The output is:

Code:
{'e', 'u', 'j', 'o', 'i', 'r', 'a', 't'}

The Intersection of two Sets is the Set of elements that are in both Sets.

Code:
seta = {"j","a","r","r","e","t"}
setb = {"a","e","i","o","u"}
setc = (seta.intersection(setb))
print (setc)

The output is:

Code:
{'a', 'e'}

The Difference of two Sets will contain the elements that are in the first Set but not the second Set.

Code:
seta = {"j","a","r","r","e","t"}
setb = {"a","e","i","o","u"}
setc = (seta.difference(setb))
print (setc)

The output is:

Code:
{'r', 't', 'j'}
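
Python also has operators that do the same work as these three methods: '|' for Union, '&' for Intersection and '-' for Difference. The sketch below checks that each operator matches its method, and that the Difference changes when the two Sets trade places.

```python
seta = {"j", "a", "r", "r", "e", "t"}
setb = {"a", "e", "i", "o", "u"}

print((seta | setb) == seta.union(setb))          # True
print((seta & setb) == seta.intersection(setb))   # True
print((seta - setb) == seta.difference(setb))     # True
print((setb - seta) == {"i", "o", "u"})           # True, the order of the Sets matters
```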

Dictionaries

Each element in a Dictionary is not a single value, but a 'key' and 'value' pair. An example was given in the Definition area.

Creating a Dictionary is similar to creating a Set, with curly brackets. We can create a Dictionary and print out the keys and values:

Code:
dict = {"1":"One","2":"Two","3":"Three","4":"Four","5":"Five"}
for key in dict:
  print (key, dict[key])

There are no indexes; you simply specify the key and you'll get the value, such as 'dict["2"]'. The key must match exactly, so the string "2" works here but the number 2 does not.
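
Here is a short sketch of a lookup that works and one that does not. The Dictionary name 'number_words' is made up for the example. Asking for a key that is not in the Dictionary raises a 'KeyError', which the example catches.

```python
number_words = {"1": "One", "2": "Two", "3": "Three"}

print(number_words["2"])   # Two

try:
    number_words[2]        # the integer 2 is not a key here
    found = True
except KeyError:
    found = False
print(found)               # False
```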

Adding a new key and value is just a matter of assigning them:

Code:
dict["6"]="Six"

To change a key/value pair, you can only change the value:

Code:
dict["6"]="Seven"

You cannot change a key, but you can delete it and add it again. So, to remove a key and value:

Code:
del dict["6"]
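
As one possible way to 'rename' a key, the sketch below copies the value to a new key and then deletes the old key. The Dictionary 'number_words' and the new key "six" are hypothetical.

```python
number_words = {"1": "One", "6": "Six"}

# Store the value under the new key, then delete the old key.
number_words["six"] = number_words["6"]
del number_words["6"]

print(number_words)   # {'1': 'One', 'six': 'Six'}
```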

If you need to check that a key exists in the Dictionary, then you can:

Code:
print("6" in dict)

The result will be 'True' or 'False' which you can check in an 'if' statement.
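
A minimal sketch of that 'if' check follows. The names 'number_words', 'key' and 'message' are only for illustration. Checking first avoids the error you would get from looking up a key that does not exist.

```python
number_words = {"1": "One", "2": "Two"}
key = "6"

if key in number_words:
    message = number_words[key]
else:
    message = f"No entry for key {key}"

print(message)   # No entry for key 6
```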

The example I used has numbers as strings as keys. The keys can be alphabetic as well.

Conclusion

There was a lot of information to cover here, but this will open up a lot of ability to use arrays of various types.

Practice with these a bit to be familiar with them. We should get close to seeing an article that is a complete program.
 
