2.1 `random_walk_fastest` is slightly incorrect: does not include starting position `0`

There is a subtle incorrectness in `random_walk_fastest`. It does not return the first position `0`, compared to other two implementations. The following code demonstrates this:

```python
N = 99
print(len(random_walk(N)), len(random_walk_faster(N)), len(random_walk_fastest(N)))
# >>> (100, 100, 99)
```

This may look as a minor distinction, but may cause bugs. There are two ways to remove this inconsistency: exclude initial `0` position from first two functions (trivial) or prepend it to the last function. There are such options for the latter path:

 - Convert `steps` to numpy list after `cumsum` and concat with `[0]`
 - Concatenate `np.array([0])`
 - Allocate empty array in advance, and use `out` parameter of `cumsum`

My implementation of these options:

```python

N = 10_000

# Option 1: Convert `steps` to python list after `cumsum` and concat with `[0]`
def random_walk_option1(n=N):
    steps = np.random.choice([-1,+1], n)
    steps = np.cumsum(steps)
    return [0] + steps.tolist()

# Option 2.1: Concatenate `np.array([0])`
def random_walk_option2_1(n=N):
    steps = np.random.choice([-1,+1], n)
    steps = np.cumsum(steps)
    return np.concatenate((np.array([0]), steps))

# Option 2.2: Concatenate `[0]`
def random_walk_option2_2(n=N):
    steps = np.random.choice([-1,+1], n)
    steps = np.cumsum(steps)
    return np.concatenate(([0], steps))

# Option 3: Allocate empty array in advance, and use `out` parameter of `cumsum`
def random_walk_option3_1(n=N):
    walk = np.empty(n+1)
    walk[0] = 0
    steps = np.random.choice([-1,+1], n)
    np.cumsum(steps, out=walk[1:])
    return walk


def random_walk_option3_2(n=N):
    walk = np.zeros(n+1)
    steps = np.random.choice([-1,+1], n)
    np.cumsum(steps, out=walk[1:])
    return walk
```

I also benchmarked them to see the difference: https://colab.research.google.com/drive/19OJdrD4SZLk4ug6OI6DC6wq3XwHQPAgr?usp=sharing

While `option_2_1` is slightly better, the benchmark is not stable. Maybe it is because the difference in the number of operations is up to the constant, the difference is not significant?

Fixing it in a book may be nasty/non-trivial because will break the simplicity of implementation and add more confusion for the reader :)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

2.1 `random_walk_fastest` is slightly incorrect: does not include starting position `0` #119

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

2.1 random_walk_fastest is slightly incorrect: does not include starting position 0 #119

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions

2.1 `random_walk_fastest` is slightly incorrect: does not include starting position `0` #119