Series 1: Concurrency experiment in 1 vCPU Azure Function Host

Divakar Kumar included in category Azure and series Concurrency vs Parallelism

2023-08-11

https://dev-to-uploads.s3.amazonaws.com/uploads/articles/68pdep6mmdnjs4dptsjr.png

About this blog

There is a lot of misconception about how servers handle concurrent requests. Some believe that requests are processed in parallel, and others disagree. Actually, both answers are correct! Let's find out how it actually works in this blog.

Single Core vs Multi Core processor

How concurrent requests are handled depends largely on whether your server runs on a single-core or a multi-core processor. That doesn't mean a single-core processor can't handle multiple requests.

Each core can run more than one thread, and these threads do not necessarily execute simultaneously. However, they are scheduled in a way that gives the illusion of simultaneous execution. This is referred to as concurrency.
A multi-core processor, on the other hand, can achieve actual simultaneous execution, commonly referred to as parallelism.

Imagine that 100 concurrent requests hit your web server at the same time. On a single-core processor, the application could start processing the first request, and while that request waits on I/O (such as waiting for data from a network, querying a database, or reading a file), the processor context-switches to another task that handles the second request, and so on. That is how it achieves concurrency (the illusion of simultaneous execution): through asynchronous operations and by switching between requests quickly.

On a multi-core processor, by contrast, actual parallelism (simultaneous execution of different requests) can be achieved. Each core can handle a separate request simultaneously, leading to faster overall processing. Hence multi-core processors generally have better throughput than single-core processors.
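To make the concurrency idea concrete, here is a minimal sketch (not part of the experiment code) in which 100 simulated requests each wait on I/O through Task.Delay. The names ConcurrencySketch, HandleRequestAsync, and RunAsync are hypothetical. The sketch does not pin the process to one core; it only shows that awaiting I/O lets many requests overlap without dedicating a thread to each.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrencySketch
{
    // Simulates one request whose time is spent waiting on I/O.
    // Task.Delay is timer-based, so no thread is blocked while waiting.
    public static async Task<int> HandleRequestAsync(int id, int ioDelayMs)
    {
        await Task.Delay(ioDelayMs);
        return id;
    }

    // Starts all requests first, then waits for every one of them to finish.
    public static async Task<(int Completed, long ElapsedMs)> RunAsync(int requestCount, int ioDelayMs)
    {
        var stopwatch = Stopwatch.StartNew();
        Task<int>[] inFlight = Enumerable.Range(0, requestCount)
            .Select(i => HandleRequestAsync(i, ioDelayMs))
            .ToArray();

        int[] results = await Task.WhenAll(inFlight);
        stopwatch.Stop();
        return (results.Length, stopwatch.ElapsedMilliseconds);
    }
}
```

Calling `await ConcurrencySketch.RunAsync(100, 2000)` should finish in roughly the time of one delay rather than 100 delays, because the waits overlap.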

Important points to remember

Single Core processor

Limited parallelism: As we have only one physical processing unit, tasks won't execute simultaneously, though we can still make progress on several of them through asynchronous operations and context switching.

Multi Core processor

True parallelism: As we have multiple physical processing units, we can execute tasks on different cores simultaneously, in addition to executing them concurrently.

Concurrency involves the execution of multiple tasks that appear to run simultaneously, while parallelism is the actual simultaneous execution of multiple tasks across multiple processors or cores.
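For contrast with I/O-bound concurrency, the following is a rough sketch of CPU-bound work run sequentially and then with Parallel.For. ParallelismSketch, BusyWork, RunSequential, and RunParallel are hypothetical names. On a multi-core machine the parallel run is typically faster; on a single core the two timings should be roughly the same, since the work can only be interleaved.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class ParallelismSketch
{
    // CPU-bound stand-in: nothing is awaited, so the core is busy the whole time.
    public static long BusyWork(int iterations)
    {
        long sum = 0;
        for (int i = 0; i < iterations; i++) sum += i % 7;
        return sum;
    }

    // Runs every unit of work one after another on the calling thread.
    public static (long Total, long ElapsedMs) RunSequential(int tasks, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        long total = 0;
        for (int t = 0; t < tasks; t++) total += BusyWork(iterations);
        return (total, stopwatch.ElapsedMilliseconds);
    }

    // Lets the runtime spread the units of work across the available cores.
    public static (long Total, long ElapsedMs) RunParallel(int tasks, int iterations)
    {
        var stopwatch = Stopwatch.StartNew();
        var partial = new long[tasks];
        Parallel.For(0, tasks, t => partial[t] = BusyWork(iterations));

        long total = 0;
        foreach (long p in partial) total += p;
        return (total, stopwatch.ElapsedMilliseconds);
    }
}
```

Both methods return the same total; only the elapsed time differs, and how much it differs depends on how many cores the process can actually use.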

vCPU vs Core

In Azure, I couldn't find a direct mapping between vCPUs and the number of cores. Each CPU can have one or more cores. Our personal laptops typically have multiple cores (you can see this in Task Manager -> Performance), but the picture is less clear in the cloud. In the Azure pricing calculator, some virtual machine series are described in terms of cores and others in terms of vCPUs. The A and B series clearly state how many cores they have, while other series are listed in vCPUs.

Typically, a core is a physical unit of a CPU, and a common assumption is that 1 vCPU = 1 physical CPU core. However, this might not hold in all scenarios.

General guidelines that give an idea of the vCPU : core ratio are available here: https://learn.microsoft.com/en-us/azure/virtual-machines/acu

Azure Function in Consumption plan

Is it a single core or multi core processor?

I chose Azure Functions to deploy my code and test concurrency, assuming that 1 vCPU is equivalent to 1 core. Azure Functions can scale out to multiple instances based on the events that the scale controller monitors. However, to test concurrency and understand it better, I limited scale-out to a maximum of 1 instance.

Each instance of the Functions host in the Consumption plan is limited, typically to 1.5 GB of memory and one CPU. An instance of the host is the entire function app, meaning all functions within a function app share resource within an instance and scale at the same time. Function apps that share the same Consumption plan scale independently. In the Premium plan, the plan size determines the available memory and CPU for all apps in that plan on that instance. ref: https://learn.microsoft.com/en-us/azure/azure-functions/event-driven-scaling?tabs=azure-cli

The Microsoft article above says that an Azure Functions host instance typically has one CPU. However, it is not clear whether it runs on one core or on multiple cores.

By the end of this experiment, let's find out whether 1 vCPU has multiple cores or a single core.
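As a complementary check that this experiment did not use, a function could simply report how many logical processors the runtime sees. The sketch below relies on Environment.ProcessorCount; HostInfo and its methods are hypothetical names. Note that the value counts logical processors visible to the process (which can include hyper-threads and can be limited by the host), so it does not directly reveal physical cores.

```csharp
using System;

public static class HostInfo
{
    // Number of logical processors visible to the current process.
    public static int LogicalProcessors() => Environment.ProcessorCount;

    // Human-readable summary that could be logged or returned from an HTTP trigger.
    public static string Describe()
    {
        int count = LogicalProcessors();
        return count > 1
            ? $"{count} logical processors visible: parallel execution is possible"
            : "1 logical processor visible: concurrency only";
    }
}
```

Logging `HostInfo.Describe()` once at startup would give a quick hint before running any load test.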

DataFlow

We will test concurrency in Azure Functions by sending 100 concurrent requests using the Apache JMeter testing tool.
All of those requests reach the Azure Functions host at the same time, and the scale controller decides whether to scale out to multiple instances based on the number of events it monitors.
For this test, however, we have limited the Azure Functions host to a maximum of one instance.
So all requests reach the same Azure Functions host (1 vCPU), which must handle them.
The server achieves concurrency by switching between requests quickly with the help of multiple threads.

Experiments

Now that we have understood the concepts, let's see how concurrent requests are handled in a 1 vCPU Azure Function. To visualize the concurrency traces in Zipkin/Jaeger/Application Insights, I made a few tweaks to the application code.

Step 1: Simple tweak to visualize concurrency

Implement OpenTelemetry tracing for the Azure Function HTTP trigger (the code is shared below).
We can inject middleware into an Azure Function (in-process) using this package.
Deploy the application.
Execute the endpoint and grab the parentSpanId and operationId.
I then used those values in the code: traces are built from the parent span id stored in the previous step. This ensures we can visualize the traces of all 100 concurrent requests under a single roof.
The code looks like this:

using AzureFunctions.Extensions.Middleware.Abstractions;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Caching.Memory;
using Microsoft.Extensions.Logging;
using System;
using System.Diagnostics;
using System.Net;
using System.Text.Json;
using System.Threading.Tasks;

namespace Concurrency.Otel
{
    public class OtelMiddleware : HttpMiddlewareBase
    {
        private readonly ILogger _logger;
        private readonly ActivitySource _activity;
        private readonly MemoryCache _cache;

        public OtelMiddleware(ILogger logger, Instrumentation instrumentation, MemoryCache cache)
        {
            _logger = logger;
            _activity = instrumentation.ActivitySource;
            _cache = cache;
        }

        public override async Task InvokeAsync(HttpContext httpContext)
        {
            try
            {
                // Trace id and parent span id captured from the initial request.
                // Hard-coding them groups all 100 requests under one trace.
                const string traceId = "1464e00d3f483c0f26cf85f0cc662028";
                const string parentSpanId = "a3f23a942a5781e8";
                const string activityTraceFlag = "Recorded";

                ActivityTraceId parentTraceIdObj = ActivityTraceId.CreateFromString(traceId.AsSpan());
                ActivitySpanId parentSpanIdObj = ActivitySpanId.CreateFromString(parentSpanId.AsSpan());
                Enum.TryParse(activityTraceFlag, out ActivityTraceFlags activityTraceFlags);

                // StartActivity returns null when no listener samples the activity, hence the ?. calls.
                using var activity = _activity.StartActivity($"{httpContext.Request.Path}", ActivityKind.Producer, new ActivityContext(parentTraceIdObj, parentSpanIdObj, activityTraceFlags));
                activity?.SetTag("http.method", httpContext.Request.Method);
                activity?.SetTag("http.host", httpContext.Request.Host.Value);
                activity?.SetTag("http.scheme", httpContext.Request.Scheme);
                activity?.SetTag("http.server_name", httpContext.Request.Host.Host);
                activity?.SetTag("net.host.port", httpContext.Request.Host.Port ?? 443);
                activity?.SetTag("http.route", httpContext.Request.Path);

                // Simulate a 2-second non-blocking wait so overlapping spans are easy to see.
                await Task.Delay(2000);
                await Next.InvokeAsync(httpContext);
            }
            catch (Exception ex)
            {
                _logger.LogError(ex, "Unhandled exception in OtelMiddleware");

                httpContext.Response.ContentType = "application/json";
                httpContext.Response.StatusCode = (int)HttpStatusCode.InternalServerError;

                // Serialize to real JSON; ToString() on an anonymous object does not produce JSON.
                await httpContext.Response.WriteAsync(JsonSerializer.Serialize(new
                {
                    StatusCode = httpContext.Response.StatusCode,
                    Message = ex.Message
                }));
            }
        }
    }
}
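The key trick in the middleware is starting every span under the same, fixed parent context. The following standalone sketch isolates that step so it can be tried outside Azure Functions. SharedParentSketch, the source name "Concurrency.Sketch", ListenToAll, and StartChild are hypothetical; the listener is an assumption added here because StartActivity returns null when nothing is listening (in the real app an OpenTelemetry exporter would normally play that role).

```csharp
using System;
using System.Diagnostics;

public static class SharedParentSketch
{
    // Hypothetical source name, used only by this sketch.
    public static readonly ActivitySource Source = new ActivitySource("Concurrency.Sketch");

    // Registers a listener that records every activity from Source.
    // Dispose the returned listener when done.
    public static ActivityListener ListenToAll()
    {
        var listener = new ActivityListener
        {
            ShouldListenTo = source => source.Name == Source.Name,
            Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
                ActivitySamplingResult.AllDataAndRecorded
        };
        ActivitySource.AddActivityListener(listener);
        return listener;
    }

    // Starts a span whose parent is the given trace id / span id pair.
    // May return null if no listener is registered.
    public static Activity StartChild(string name, string traceId, string parentSpanId)
    {
        var parent = new ActivityContext(
            ActivityTraceId.CreateFromString(traceId.AsSpan()),
            ActivitySpanId.CreateFromString(parentSpanId.AsSpan()),
            ActivityTraceFlags.Recorded);

        return Source.StartActivity(name, ActivityKind.Internal, parent);
    }
}
```

Every span started through StartChild with the same two ids shares one trace id, which is why all 100 requests show up under a single operation Id.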

Step 2: Task.Delay in the event publish delegate

I created a delegate wrapper to simulate the network calls, file reads, or database reads that could happen in any request.
The delegate code looks like this:

using Concurrency.Otel;
using Microsoft.Extensions.Caching.Memory;
using System;
using System.Diagnostics;
using System.Threading.Tasks;

namespace CorporateService.EventsProcessor.Otel
{
    /// <summary>
    /// Otel logic wrapper for azure functions
    /// </summary>
    public class OtelWrapper : IOtelWrapper
    {
        private readonly ActivitySource _activity;
        private readonly MemoryCache _cache;

        public OtelWrapper(MemoryCache cache, Instrumentation instrumentation)
        {
            _activity = instrumentation.ActivitySource;
            _cache = cache;
        }

        /// <summary>
        /// Clean way to wrap the otel logic into a wrapper and execute the delegate passed from func
        /// </summary>
        /// <param name="processHttpRequest">Delegate that performs the actual request processing</param>
        /// <param name="event">Payload passed through to the delegate</param>
        /// <returns>The result returned by the delegate</returns>
        public async Task<string> Execute(Func<string, Task<string>> processHttpRequest, string @event)
        {
            // Same trace id and parent span id as the middleware, so both spans land in one trace.
            const string traceId = "1464e00d3f483c0f26cf85f0cc662028";
            const string parentSpanId = "a3f23a942a5781e8";
            const string activityTraceFlag = "Recorded";

            ActivityTraceId parentTraceIdObj = ActivityTraceId.CreateFromString(traceId.AsSpan());
            ActivitySpanId parentSpanIdObj = ActivitySpanId.CreateFromString(parentSpanId.AsSpan());
            Enum.TryParse(activityTraceFlag, out ActivityTraceFlags activityTraceFlags);

            const string activityName = "event publish";
            using var activity = _activity.StartActivity(activityName, ActivityKind.Producer, new ActivityContext(parentTraceIdObj, parentSpanIdObj, activityTraceFlags));

            activity?.SetTag("messaging.system", "servicebus");
            activity?.SetTag("messaging.operation", "publish");
            activity?.SetTag("messaging.destination_kind", "topic");
            activity?.SetTag("messaging.servicebus.topic", "concurrency");

            // Simulate a 2-second I/O wait (network call, file read, or database read).
            await Task.Delay(2000);

            return await processHttpRequest(@event);
        }
    }
}

Step 3: Use Apache JMeter to perform the load test

To set up Apache JMeter on Windows, please refer to my blog here :

Add the path in the HTTP request sampler as below
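The experiment itself used JMeter. As a rough alternative for readers without JMeter, the sketch below fires a batch of GET requests from C# with HttpClient and counts the successful responses. LoadSketch, SendConcurrentAsync, and AlwaysOkHandler are hypothetical names; the handler is only a stand-in so the sketch can be dry-run without a network. Unlike JMeter, this gives no control over ramp-up, and the requests are not guaranteed to arrive at exactly the same instant.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class LoadSketch
{
    // Starts `count` GET requests without awaiting each one, then waits for all of them.
    // Returns how many came back with a success status code.
    public static async Task<int> SendConcurrentAsync(HttpClient client, string url, int count)
    {
        Task<HttpResponseMessage>[] inFlight = Enumerable.Range(0, count)
            .Select(_ => client.GetAsync(url))
            .ToArray();

        HttpResponseMessage[] responses = await Task.WhenAll(inFlight);
        int succeeded = responses.Count(r => r.IsSuccessStatusCode);
        foreach (HttpResponseMessage response in responses) response.Dispose();
        return succeeded;
    }

    // Stand-in handler for offline dry runs: waits briefly, then answers 200 OK.
    public sealed class AlwaysOkHandler : HttpMessageHandler
    {
        protected override async Task<HttpResponseMessage> SendAsync(
            HttpRequestMessage request, CancellationToken cancellationToken)
        {
            await Task.Delay(50, cancellationToken);
            return new HttpResponseMessage(HttpStatusCode.OK);
        }
    }
}
```

For a real run, pass a plain `new HttpClient()` and the full URL of your own function app's /api/concurrency endpoint; for a dry run, construct the client with `new LoadSketch.AlwaysOkHandler()`.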

Step 4: Visualize in Application Insights

From the above visualization, it is not clear whether the requests were processed concurrently or in parallel.
Now let's find out with some KQL queries in the Log Analytics workspace for our operation Id.

Step 5: Execute KQL in the Log Analytics workspace


union *
| where timestamp > datetime("2023-08-10T13:21:49.441Z") and timestamp < datetime("2023-08-12T13:21:49.441Z")
| where operation_Id == "1464e00d3f483c0f26cf85f0cc662028" 

As we have limited scale-out to a maximum of one instance, the scale controller doesn't spin up any new instance for the 100 concurrent requests, and all requests were served by Cloud Role Instance 10-30-14-87.
The query returns 202 records in total: 100 spans for /api/concurrency, 100 spans for event publish, and 2 spans for the initial request.

union *
| where timestamp > datetime("2023-08-10T13:21:49.441Z") and timestamp < datetime("2023-08-12T13:21:49.441Z")
| where operation_Id == "1464e00d3f483c0f26cf85f0cc662028" 
| summarize RecordCount = count(), Records = make_list(pack_all()) by timestamp
| where RecordCount > 1
| project timestamp, Records

Hmm! Surprisingly, I could see 2 records processed at exactly the same moment. This is most plausibly explained by multiple cores processing the requests simultaneously.

Hence our assumption of a 1:1 vCPU:Core ratio is wrong, at least in the context of Azure Functions.
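The duplicate-timestamp check done by the second KQL query can also be sketched in C# for records exported from Log Analytics. TimestampSketch and FindSharedTimestamps are hypothetical names, and the sketch assumes the timestamps have already been parsed into DateTimeOffset values. It mirrors the `summarize ... by timestamp | where RecordCount > 1` part of the query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimestampSketch
{
    // Groups records by exact timestamp and keeps only timestamps shared by more than one record.
    // The value is the number of records that share that timestamp.
    public static Dictionary<DateTimeOffset, int> FindSharedTimestamps(IEnumerable<DateTimeOffset> timestamps)
    {
        return timestamps
            .GroupBy(t => t)
            .Where(group => group.Count() > 1)
            .ToDictionary(group => group.Key, group => group.Count());
    }
}
```

An empty result would mean no two spans started at the same recorded instant; any entry with a count of 2 or more corresponds to a row returned by the KQL query.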

References

https://learn.microsoft.com/en-us/azure/azure-functions/event-driven-scaling?tabs=azure-cli#limit-scale-out

Future experiments

Dynamic concurrency on Azure Blob, Azure Queue, and Service Bus triggers
I will experiment with concurrency on a single-core processor. I will probably choose a single-core virtual machine and test my application there.