Process Optimization
Learning Goals
- Understand the concepts of time and space complexity
- Understand the impact of poor optimization
- Gain familiarity with Big O terminology
Warm Up
Today, we are going to discuss the impact of data structures and algorithms on the programs that we write. It will be helpful to have a basic understanding of what those words mean.
Spend 5 minutes researching data structures and algorithms as they relate to programming. Be ready to share your own definition of these two concepts!
For the purpose of this lesson, we want students to know that:
- A data structure is any kind of special format for storing data. This could be: string, array, dictionary, linked list, class, etc.
- An algorithm is any process for accomplishing a goal, typically involving iterative actions or repeatable processes.
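To make these definitions concrete, here is a minimal C# sketch (the names are invented for this example): the Dictionary is the data structure, and the counting loop is the algorithm.

```csharp
using System;
using System.Collections.Generic;

class WordCountDemo
{
    // The Dictionary is a data structure: a special format for storing data.
    // CountWords is an algorithm: a repeatable process for accomplishing a goal.
    public static Dictionary<string, int> CountWords(string[] words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var word in words)
        {
            if (counts.ContainsKey(word))
                counts[word]++;
            else
                counts[word] = 1;
        }
        return counts;
    }

    static void Main()
    {
        var counts = CountWords(new[] { "red", "blue", "red" });
        Console.WriteLine(counts["red"]); // prints 2
    }
}
```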
The Time/Space Tradeoff
Data structures and algorithms are essential tools for software development - they help you organize, manipulate, and optimize your data and code. However, choosing the right data structure and algorithm for a given problem is not always straightforward, as you may have to consider the trade-offs between time and space complexity.
- Time Complexity - A measure of the amount of time needed to complete an operation or algorithm.
- Space Complexity - A measure of how much memory is needed to complete an operation or algorithm.
Time and Space are both scarce resources. If developers do not consider time/space complexity, they can inadvertently develop applications that are difficult, or impossible, to use.
Time is important because our users will refuse to use applications that take an overly long time to process data. Think about when you visit a website and it takes more than a few seconds to load; how likely are you to wait, and for how long?
Space is important because there is only so much of it - our computers have a set amount of memory, and we can’t exceed it. Similarly, the servers that host our web applications have a set amount of memory, and only a portion of that may be allocated to our application! Increasing the memory our applications have access to costs money - a good developer keeps this in mind! We also need to consider the space limitations our users may have; users of web applications operate on a wide variety of hardware and software configurations, which can impact the amount of space we can rely on. This means we need to be careful about how much processing we push to the client.
Generally, time and space complexity have an inverse relationship: improving one tends to worsen the other. So, how do we decide what to focus on? Usually, we focus on what we can most control, which is the time complexity of our algorithms. That said, you should always check in with your team and managers to make sure your priorities are in line with the organization’s!
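A classic example of trading space for time is caching (memoization): we spend extra memory storing results so we never repeat a slow computation. A minimal sketch, using Fibonacci numbers as a stand-in for an expensive calculation:

```csharp
using System;
using System.Collections.Generic;

class TradeoffDemo
{
    // Trading space for time: the cache costs memory,
    // but lets us skip recomputing any value we have already seen.
    static readonly Dictionary<int, long> cache = new Dictionary<int, long>();

    public static long Fib(int n)
    {
        if (n < 2) return n;
        if (cache.ContainsKey(n)) return cache[n]; // fast path: spends space
        long result = Fib(n - 1) + Fib(n - 2);     // slow path: spends time
        cache[n] = result;
        return result;
    }

    static void Main()
    {
        // Without the cache, naive recursion would take billions of steps here.
        Console.WriteLine(Fib(50)); // prints 12586269025
    }
}
```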
Nested Iteration and Over-Iteration
Generally, when we talk about time/space complexity, we are trying to solve a problem that involves iteration. When we need to do the same thing with or to many different records, there are good and less-good approaches. Let’s dive into the example below:
Let’s practice solving a problem at a high level. Just focus on writing out pseudo code, NOT actual code.
Problem:
- You have an array which contains all numbers from 1 to 1 million.
- The numbers are randomly ordered/shuffled in the array.
- One number is in the array twice, also at some random location.

How might you find the duplicate value?
Note
There are a couple of solutions to the problem above that have various pros and cons. Don’t worry about trying to get the perfect solution. Instead, practice breaking down the problem and thinking about the approach you could take.
Potential Solution A
Potential Solution B
Potential Solution C
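The solutions above may take different approaches, but one common strategy (not necessarily what they show) is to remember each number as you see it and stop at the first repeat. A sketch in C#:

```csharp
using System;
using System.Collections.Generic;

class DuplicateFinder
{
    // Remember every number we've seen; the first repeat is the duplicate.
    public static int FindDuplicate(int[] nums)
    {
        var seen = new HashSet<int>();
        foreach (var num in nums)
        {
            if (!seen.Add(num)) // Add returns false if the value is already present
                return num;
        }
        throw new InvalidOperationException("No duplicate found");
    }

    static void Main()
    {
        Console.WriteLine(FindDuplicate(new[] { 3, 1, 4, 1, 5 })); // prints 1
    }
}
```

Note the tradeoff: the HashSet spends extra space to avoid re-scanning the array for every element.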
Choosing an Algorithm
All of the options above will work to solve the problem, so how do we know which is ‘best’? We need some way of calculating or describing the time-complexity of an algorithm. Typically, we describe time complexity in terms of how many steps it will take to complete in relation to an increasing number of elements.
In small groups, determine how many ‘steps’ are completed for the two code snippets below.
- How many steps are represented by the current code?
- What if num[1] is a duplicate of num[0]?
- What if num[9] is a duplicate of num[5]?
- How many steps would there be if nums had 25 elements?
- How many steps would there be if nums had 50 elements?
Option 1
// find the duplicate
int[] nums = { 36, 84, 75, 20, 30, 44, 17, 98, 44, 92, 27 };
int? duplicate = null; // Initialize duplicate as null
for (int i = 0; i < nums.Length; i++)
{
for (int j = i + 1; j < nums.Length; j++)
{
if (nums[i] == nums[j])
{
duplicate = nums[i];
}
}
}
Console.WriteLine(duplicate);
Option 2
int[] nums = { 36, 84, 75, 20, 92, 44, 17, 98, 92, 27 };
Dictionary<int, int> countOfNums = new Dictionary<int, int>();
foreach (var num in nums)
{
if (countOfNums.ContainsKey(num))
{
Console.WriteLine(num);
break;
}
countOfNums.Add(num, 1);
}
Big O
It’s not very useful to have to describe algorithms in an exact number of steps for every possible number of elements. So, developers describe time complexity using mathematical expressions. We call these expressions Big O notation (big ‘oh’). The Big O of an algorithm describes the worst-case scenario: the maximum number of steps an algorithm will take in relation to how the dataset grows.
When describing Big O, we use "n" to signify the amount of data. The most common examples you might see include:
- O(1)
- O(log n)
- O(n)
- O(n log n)
- O(n^2) (aka, n-squared)
We’ll discuss each of these in order from least complex (best performance) to most complex (worst performance).
Constant Time - O(1)
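For example, indexing into an array or looking up a key in a dictionary takes the same number of steps no matter how large the collection grows:

```csharp
using System;
using System.Collections.Generic;

class ConstantTimeDemo
{
    static void Main()
    {
        int[] nums = { 36, 84, 75, 20, 30 };
        var ages = new Dictionary<string, int> { { "Ada", 36 } };

        // Both lookups are O(1): the step count does not change
        // whether the collection holds 5 elements or 5 million.
        Console.WriteLine(nums[2]);      // prints 75
        Console.WriteLine(ages["Ada"]);  // prints 36
    }
}
```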
Logarithmic Time - O(log n)
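Binary search is the classic O(log n) algorithm: each comparison discards half of the remaining (sorted) elements, so n items take roughly log2(n) steps. A sketch:

```csharp
using System;

class BinarySearchDemo
{
    // Each pass halves the search range, so the step count
    // grows logarithmically with the array's length.
    public static int BinarySearch(int[] sorted, int target)
    {
        int low = 0, high = sorted.Length - 1;
        while (low <= high)
        {
            int mid = (low + high) / 2;
            if (sorted[mid] == target) return mid;
            if (sorted[mid] < target) low = mid + 1;
            else high = mid - 1;
        }
        return -1; // not found
    }

    static void Main()
    {
        int[] sorted = { 17, 20, 27, 36, 44, 75, 84, 92, 98 };
        Console.WriteLine(BinarySearch(sorted, 44)); // prints 4
    }
}
```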
Linear Time - O(n)
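A single pass over the data is O(n): doubling the number of elements doubles the steps. For example, finding the largest value:

```csharp
using System;

class LinearTimeDemo
{
    // One pass, one comparison per element: O(n).
    public static int Max(int[] nums)
    {
        int max = nums[0];
        foreach (var num in nums)
            if (num > max) max = num;
        return max;
    }

    static void Main()
    {
        Console.WriteLine(Max(new[] { 36, 84, 75, 20, 30 })); // prints 84
    }
}
```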
Linearithmic Time - O(n log n)
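Comparison-based sorting (such as Array.Sort) runs in O(n log n). One way this shows up in our duplicate problem: sort first, then scan adjacent pairs - O(n log n) for the sort plus O(n) for the scan, which simplifies to O(n log n) overall. A sketch:

```csharp
using System;

class SortScanDemo
{
    // Sort (O(n log n)), then compare neighbors (O(n)): O(n log n) overall.
    public static int FindDuplicateBySorting(int[] nums)
    {
        var copy = (int[])nums.Clone(); // avoid mutating the caller's array
        Array.Sort(copy);
        for (int i = 1; i < copy.Length; i++)
        {
            if (copy[i] == copy[i - 1])
                return copy[i];
        }
        throw new InvalidOperationException("No duplicate found");
    }

    static void Main()
    {
        int[] nums = { 36, 84, 75, 20, 92, 44, 17, 98, 92, 27 };
        Console.WriteLine(FindDuplicateBySorting(nums)); // prints 92
    }
}
```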
Quadratic Time - O(n^2)
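Nested iteration over the same data, like Option 1 earlier, is the classic O(n^2) shape: n elements produce roughly n * n comparisons. For example, counting the pairs that sum to a target:

```csharp
using System;

class QuadraticDemo
{
    // Every element is compared against every later element,
    // so the step count grows with the square of the array's length.
    public static int CountPairsThatSumTo(int[] nums, int target)
    {
        int count = 0;
        for (int i = 0; i < nums.Length; i++)
            for (int j = i + 1; j < nums.Length; j++)
                if (nums[i] + nums[j] == target)
                    count++;
        return count;
    }

    static void Main()
    {
        // Pairs summing to 5: (1, 4) and (2, 3)
        Console.WriteLine(CountPairsThatSumTo(new[] { 1, 2, 3, 4 }, 5)); // prints 2
    }
}
```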
Some Key Considerations
Usually when calculating Big O notation, an algorithm doesn’t fall nicely into one of the categories above. It’s more likely that an algorithm combines some of the given strategies. Here are some key things to keep in mind:
- When operations are sequential, their complexities add together
- When operations are nested (within loops for example), their complexities multiply
- When calculating Big O, we don’t care about constants or coefficients. For example, O(2n + 1) is simplified to O(n).
- When calculating Big O, if your algorithm requires several sequential steps, your Big O notation would only list the worst of those complexities. For example, O(n + n^2) would be simplified to O(n^2).
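Putting those rules together, here is a small sketch that walks through the arithmetic for one sequential loop followed by a pair of nested loops:

```csharp
using System;

class SimplifyDemo
{
    static void Main()
    {
        int[] nums = { 36, 84, 75, 20, 30 };
        int n = nums.Length;

        // Sequential loop: O(n)
        long sum = 0;
        foreach (var num in nums) sum += num;

        // Nested loops: O(n) * O(n) = O(n^2)
        int comparisons = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                comparisons++;

        // Sequential steps add: O(n + n^2).
        // The worst term dominates, so the whole program is O(n^2).
        Console.WriteLine(sum);          // prints 245
        Console.WriteLine(comparisons);  // prints 25
    }
}
```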
With a partner, identify the Big O notation for each of the following code snippets. Be ready to share out why you believe this is the correct identification!
Algorithm A
Algorithm B
Algorithm A is O(n). Algorithm B is O(n^2).
Checks for Understanding
- Why do we want to avoid nested iteration?
- What is Big O? What does it describe?
- Explain the differences between “time complexity” and “space complexity”.