Questions tagged [vectorization]

Vectorization refers to a programming paradigm where functions operate on whole arrays in one go. This affords benefits in terms of function calls, memory access, parallelization and code expressiveness. Some programming languages, such as MATLAB, are optimised to give the best performance when vectorized.

Vectorization refers to a programming paradigm in which loop-based, scalar-oriented code is rewritten using matrix and vector operations. Vectorization has the following benefits:

  • Performance: Vectorized code incurs fewer function calls and accesses memory more efficiently, so it often runs much faster than the equivalent loop-based code.

  • Appearance: Vectorized code reads more like textbook mathematical expressions, which makes it easier to understand.

  • Less error-prone: Vectorized code is usually shorter than loop-based code, so there are fewer opportunities to introduce bugs.

Some programming languages, in particular MATLAB, are optimised to give the best performance when vectorized.
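
As a rough illustration of the description above, the following minimal sketch contrasts a scalar-oriented loop with the equivalent whole-array operation. It assumes Python with NumPy (the tag itself is language-agnostic), and the function names `add_loop` and `add_vectorized` are hypothetical, chosen only for this example.

```python
import numpy as np

def add_loop(a, b):
    """Scalar-oriented version: one Python-level operation per element."""
    out = np.empty_like(a)
    for i in range(len(a)):
        out[i] = a[i] + b[i]
    return out

def add_vectorized(a, b):
    """Vectorized version: a single operation on the whole arrays."""
    return a + b

a = np.arange(5, dtype=np.float64)   # [0., 1., 2., 3., 4.]
b = np.full(5, 10.0)                 # [10., 10., 10., 10., 10.]

loop_result = add_loop(a, b)
vec_result = add_vectorized(a, b)

# Both approaches compute the same values; only the style differs.
print(np.array_equal(loop_result, vec_result))
```

Both functions return the same array. The vectorized form is shorter and hands the element-wise work to NumPy's compiled routines, which is typically where the speed-up on large arrays comes from.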

6576 questions
10 answers

Why are elementwise additions much faster in separate loops than in a combined loop?

Suppose a1, b1, c1, and d1 point to heap memory, and my numerical code has the following core loop. const int n = 100000; for (int j = 0; j < n; j++) { a1[j] += b1[j]; c1[j] += d1[j]; } This loop is executed 10,000 times via another outer…
Johannes Gerer
12 answers

Grouping functions (tapply, by, aggregate) and the *apply family

Whenever I want to do something "map"py in R, I usually try to use a function in the apply family. However, I've never quite understood the differences between them -- how {sapply, lapply, etc.} apply the function to the input/grouped input, what…
11 answers

Difference between map, applymap and apply methods in Pandas

Can you tell me when to use these vectorization methods with basic examples? I see that map is a Series method whereas the rest are DataFrame methods. I got confused about apply and applymap methods though. Why do we have two methods for applying a…
4 answers

Is there an R function for finding the index of an element in a vector?

In R, I have an element x and a vector v. I want to find the first index of an element in v that is equal to x. I know that one way to do this is: which(x == v)[[1]], but that seems excessively inefficient. Is there a more direct way to do it? For…
Ryan C. Thompson
9 answers

What is "vectorization"?

Several times now, I've encountered this term in matlab, fortran ... some other ... but I've never found an explanation what does it mean, and what it does? So I'm asking here, what is vectorization, and what does it mean for example, that "a loop…
Thomas Geritzma
2 answers

Are for-loops in pandas really bad? When should I care?

Are for loops really "bad"? If not, in what situation(s) would they be better than using a more conventional "vectorized" approach? I am familiar with the concept of "vectorization", and how pandas employs vectorized techniques to speed up…
9 answers

Why can't R's ifelse statements return vectors?

I've found R's ifelse statements to be pretty handy from time to time. For example: ifelse(TRUE,1,2) # [1] 1 ifelse(FALSE,1,2) # [1] 2 But I'm somewhat confused by the following behavior. ifelse(TRUE,c(1,2),c(3,4)) # [1]…
Christopher DuBois
4 answers

Is the "*apply" family really not vectorized?

So we are used to say to every R new user that "apply isn't vectorized, check out the Patrick Burns R Inferno Circle 4" which says (I quote): A common reflex is to use a function in the apply family. This is not vectorization, it is loop-hiding.…
David Arenburg
6 answers

Efficient evaluation of a function at every cell of a NumPy array

Given a NumPy array A, what is the fastest/most efficient way to apply the same function, f, to every cell? Suppose that we will assign to A(i,j) the f(A(i,j)). The function, f, doesn't have a binary output, thus the mask(ing) operations won't…
12 answers

How can I apply a function to every row/column of a matrix in MATLAB?

You can apply a function to every item in a vector by saying, for example, v + 1, or you can use the function arrayfun. How can I do it for every row/column of a matrix without using a for loop?
3 answers

Why is vectorization, faster in general, than loops?

Why, at the lowest level of the hardware performing operations and the general underlying operations involved (i.e.: things general to all programming languages' actual implementations when running code), is vectorization typically so dramatically…
Ben Sandeen
9 answers

Do any JVM's JIT compilers generate code that uses vectorized floating point instructions?

Let's say the bottleneck of my Java program really is some tight loops to compute a bunch of vector dot products. Yes I've profiled, yes it's the bottleneck, yes it's significant, yes that's just how the algorithm is, yes I've run Proguard to…
Sean Owen
8 answers

numpy array TypeError: only integer scalar arrays can be converted to a scalar index

i=np.arange(1,4, a=np.arange(9).reshape(3,3) and a >>>array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) a[:,0:1] >>>array([[0], [3], [6]]) a[:,0:2] >>>array([[0, 1], [3, 4], [6,…
kinder chen
5 answers

Vectorized way of calculating row-wise dot product two matrices with Scipy

I want to calculate the row-wise dot product of two matrices of the same dimension as fast as possible. This is the way I am doing it: import numpy as np a = np.array([[1,2,3], [3,4,5]]) b = np.array([[1,2,3], [1,2,3]]) result = np.array([]) for…
1 answer

Does ifelse really calculate both of its vectors every time? Is it slow?

Does ifelse really calculate both the yes and no vectors -- as in, the entirety of each vector? Or does it just calculate some values from each vector? Also, is ifelse really that slow?
Ricardo Saporta