So what is the solution if the sequential consistency model imposes so many constraints on the compiler? Do we just live with the decrease in performance?
Sir, my question is: can we avoid the for loop and call omp_get_num_threads() just once? I was trying to do it, but I failed. Or is it explained in upcoming videos?
I have a doubt: dot is a shared variable, so I don't think the program will produce the required output, as there will be a race condition involved. If somebody could clear this up, I would be very grateful.
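For anyone with the same doubt, here is a minimal sketch of one common way to avoid a race on a shared accumulator, assuming dot is a sum built up inside a parallel loop. The lecture's actual code is not shown in this thread, so the function name dot_product and its parameters are hypothetical. The reduction(+:dot) clause gives each thread its own private copy of dot and adds the copies together when the loop ends.

```c
#include <stddef.h>

/* Hypothetical dot product, not the lecture's code.
   Without a reduction (or a critical/atomic section), every thread would
   update the shared variable dot at the same time, which is a data race.
   reduction(+:dot) gives each thread a private copy initialised to 0 and
   sums the private copies into the shared dot after the loop. */
double dot_product(const double *a, const double *b, int n)
{
    double dot = 0.0;

    #pragma omp parallel for reduction(+:dot)
    for (int i = 0; i < n; i++) {
        dot += a[i] * b[i];
    }

    return dot;
}
```

Compiled with OpenMP enabled (for example -fopenmp on GCC), the loop iterations are split across threads. Compiled without it, the pragma is ignored and the loop runs serially, giving the same sum up to floating-point rounding order.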
How can we have a different number of threads in different parallel regions? If I have 4 threads in the first parallel region and 3 threads in the second parallel region, which one is going to be killed (the first or the fourth)?
You are pipelining the data transfers through the bus, so essentially every 4-byte memory transfer still takes 100 ns, but the 2nd memory transfer starts 5 ns after the 1st, and so on.
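To make the arithmetic concrete, here is a small sketch using the 100 ns latency and 5 ns spacing from the comment above. The function names and the transfer count are assumptions chosen for illustration. With pipelining, the last of n transfers starts (n - 1) * 5 ns after the first and finishes 100 ns later.

```c
/* Hypothetical helpers for illustration only.
   latency_ns:  time for one transfer to complete (100 ns in the comment).
   interval_ns: gap between the starts of consecutive transfers (5 ns). */

/* Total time when each transfer waits for the previous one to finish. */
long serial_total_ns(long n, long latency_ns)
{
    return n * latency_ns;
}

/* Total time when transfers overlap: the last transfer starts
   (n - 1) * interval_ns after the first and then takes latency_ns. */
long pipelined_total_ns(long n, long latency_ns, long interval_ns)
{
    if (n <= 0) {
        return 0;
    }
    return latency_ns + (n - 1) * interval_ns;
}
```

For 4 transfers, pipelined_total_ns(4, 100, 5) gives 115 ns, compared with 400 ns from serial_total_ns(4, 100).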
Branching statements are things like if statements and for and while loops. Think of it as if(flag1) {flag2 = true}. The second statement, "flag2 = true", depends on the result of evaluating flag1. If "flag2 = true" is already in the pipeline and flag1 turns out to be false, the second statement will be thrown away, which basically means the CPU cycles spent on processing "flag2 = true" are wasted.
The problem is with recursion. How brilliant that student must be. I thought of the same 3 problems that might occur, but recursion was nowhere in my mind.