.NET for face recognition and classification

At entertainment venues such as amusement parks, glass bridges, and ski resorts, you often see photographers at work. One thing that worries these operators is the sheer number of photos: for customers, finding themselves among thousands of pictures is not an easy task. The same goes for events such as a school play or a family gathering, where too many photos make selection difficult.

Fortunately, with .NET and a small amount of code, you can easily detect faces and group them.

This article uses Cognitive Services, provided by the Microsoft Azure cloud, to detect and group faces. It can be used for free; register at https://portal.azure.com. After registration you will get two API keys, which are all you need to run the code in this article. A key looks like this (not a real key):

fa3a7bfd807ccd6b17cf559ad584cbaa

Instructions

First install the NuGet package Microsoft.Azure.CognitiveServices.Vision.Face (the latest version is currently 2.5.0-preview.1), then create a FaceClient:

string key = "fa3a7bfd807ccd6b17cf559ad584cbaa"; // replace with your key
using var fc = new FaceClient(new ApiKeyServiceClientCredentials(key))
{
    Endpoint = "https://southeastasia.api.cognitive.microsoft.com",
};

Then detect the faces in a photo:

using var file = File.OpenRead(@"C:\Photos\DSC_996ICU.JPG");
IList<DetectedFace> faces = await fc.Face.DetectWithStreamAsync(file);

The returned faces is an IList, so clearly multiple faces can be detected in a single call. One example returned the following result (serialized to JSON):

[
    {
      "FaceId": "9997b64e-6e62-4424-88b5-f4780d3767c6",
      "RecognitionModel": null,
      "FaceRectangle": {
        "Width": 174,
        "Height": 174,
        "Left": 62,
        "Top": 559
      },
      "FaceLandmarks": null,
      "FaceAttributes": null
    },
    {
      "FaceId": "8793b251-8cc8-45c5-ab68-e7c9064c4cfd",
      "RecognitionModel": null,
      "FaceRectangle": {
        "Width": 152,
        "Height": 152,
        "Left": 775,
        "Top": 580
      },
      "FaceLandmarks": null,
      "FaceAttributes": null
    }
  ]

As you can see, the photo returned two DetectedFace objects. FaceId stores the Id used for subsequent grouping, and FaceRectangle stores the position of the face for further processing. RecognitionModel, FaceLandmarks, and FaceAttributes are additional attributes covering gender, age, emotion, and more; by default they are not computed. As the API documentation shows, these parameters can be quite interesting and are worth trying:
Face_Recognition_and_Classification_Using_.NET_and_Azure_0.png
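For example, here is a minimal sketch of requesting some of these attributes (the optional-parameter and enum names are from the 2.x SDK as I recall them, so treat the exact list as an assumption):

using var file = File.OpenRead(@"C:\Photos\DSC_996ICU.JPG");
// Ask the service to also compute age, gender and emotion for each detected face.
IList<DetectedFace> detailedFaces = await fc.Face.DetectWithStreamAsync(
    file,
    returnFaceAttributes: new List<FaceAttributeType>
    {
        FaceAttributeType.Age,
        FaceAttributeType.Gender,
        FaceAttributeType.Emotion,
    });

foreach (DetectedFace face in detailedFaces)
{
    // FaceAttributes is only populated because it was requested above.
    Console.WriteLine($"{face.FaceId}: age {face.FaceAttributes.Age}, gender {face.FaceAttributes.Gender}");
}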

Finally, call .GroupAsync to group the previously detected faceIds:

var faceIds = faces.Select(x => x.FaceId.Value).ToList();
GroupResult result = await fc.Face.GroupAsync(faceIds);

It returns a GroupResult, whose definition looks like this:

public class GroupResult
{
    public IList<IList<Guid>> Groups
    {
        get;
        set;
    }

    public IList<Guid> MessyGroup
    {
        get;
        set;
    }

    // ...
}

It contains a Groups property and a MessyGroup property: Groups holds the groups of matching faces, and MessyGroup holds the FaceIds for which no group could be found.
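As a quick illustration (my own minimal sketch, not code from the original article), the grouping result can be dumped to the console like this:

// Print each group of matching FaceIds, plus the count of ungrouped faces.
for (int i = 0; i < result.Groups.Count; i++)
{
    Console.WriteLine($"Group {i + 1}: {string.Join(", ", result.Groups[i])}");
}
Console.WriteLine($"Ungrouped faces: {result.MessyGroup.Count}");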

With this, a short piece of code can copy each face group into its own folder:

void CopyGroup(string outputPath, GroupResult result, Dictionary<Guid, (string file, DetectedFace face)> faces)
{
    // Copy every photo containing a grouped face into a numbered folder per group.
    foreach (var item in result.Groups
        .SelectMany((group, index) => group.Select(v => (faceId: v, index)))
        .Select(x => (info: faces[x.faceId], i: x.index + 1)).Dump())
    {
        string dir = Path.Combine(outputPath, item.i.ToString());
        Directory.CreateDirectory(dir);
        File.Copy(item.info.file, Path.Combine(dir, Path.GetFileName(item.info.file)), overwrite: true);
    }
    
    // Photos whose faces could not be grouped go into a "messy" folder.
    string messyFolder = Path.Combine(outputPath, "messy");
    Directory.CreateDirectory(messyFolder);
    foreach (var file in result.MessyGroup.Select(x => faces[x].file).Distinct())
    {
        File.Copy(file, Path.Combine(messyFolder, Path.GetFileName(file)), overwrite: true);
    }
}

Then you can see the results. As shown in the figure, I passed in 102 photos and got 15 groups plus one "no teammate found" group:
Face_Recognition_and_Classification_Using_.NET_and_Azure_1.png

What else can go wrong?

Just two API calls and a little glue code. Feels too simple? In practice, there are still quite a few problems.

The picture is too large and needs to be compressed

After all, the photos are uploaded to a cloud service: poor upload speed and growing traffic add up quickly. Today's phones, DSLRs, and mirrorless cameras easily reach tens of megapixels, and a JPG can easily hit 10 MB. Without compressing before upload, neither the bandwidth nor the waiting time is bearable.

Besides, Azure does not support such images anyway. The documentation (https://docs.microsoft.com/en-us/rest/api/cognitiveservices/face/face/detectwithstream) states that the maximum supported image size is 6 MB and the dimensions should not exceed 1920×1080:

  • JPEG, PNG, GIF (the first frame), and BMP format are supported. The allowed image file size is from 1KB to 6MB.
  • The minimum detectable face size is 36×36 pixels in an image no larger than 1920×1080 pixels. Images with dimensions higher than 1920×1080 pixels will need a proportionally larger minimum face size.

Therefore, if the image is too large, it needs a certain degree of compression (and of course, if the image is already small enough, no compression is needed). Using .NET's Bitmap together with the C# 8.0 switch expression, this decision logic can be written in one pass:

byte[] CompressImage(string image, int edgeLimit = 1920)
{
    using var bmp = Bitmap.FromFile(image);
    
    // Only shrink when the longest edge exceeds the limit; otherwise keep the original.
    using var resized = (1.0 * Math.Max(bmp.Width, bmp.Height) / edgeLimit) switch
    {
        var x when x > 1 => new Bitmap(bmp, new Size((int)(bmp.Size.Width / x), (int)(bmp.Size.Height / x))), 
        _ => bmp, 
    };
    
    using var ms = new MemoryStream();
    resized.Save(ms, ImageFormat.Jpeg);
    return ms.ToArray();
}
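The sketch above always saves with the default JPEG quality. If a resized photo still exceeds the 6 MB limit, one option (my addition, not something the original code does) is to re-encode at a lower quality via System.Drawing's encoder parameters:

// A hedged sketch: re-encode an image as JPEG with an explicit quality (0-100)
// to shrink files that are still too large after resizing.
byte[] SaveJpegWithQuality(Image image, long quality = 85)
{
    ImageCodecInfo jpegCodec = ImageCodecInfo.GetImageEncoders()
        .First(c => c.FormatID == ImageFormat.Jpeg.Guid);

    using var parameters = new EncoderParameters(1);
    parameters.Param[0] = new EncoderParameter(System.Drawing.Imaging.Encoder.Quality, quality);

    using var ms = new MemoryStream();
    image.Save(ms, jpegCodec, parameters);
    return ms.ToArray();
}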

Portrait (vertical) photos

Camera sensors are generally 3:2, and most photos are shot in landscape orientation. But occasionally, for a particular composition, we shoot in portrait orientation. Although many face APIs now tolerate faces turned by roughly ±30 degrees, faces rotated a full 90 degrees are basically not supported, as shown below (I can't find a model who would authorize the use of their photo 😂):

Fortunately, the EXIF information is retained after the photo is taken; just read the EXIF orientation and rotate the photo accordingly:

void HandleOrientation(Image image, PropertyItem[] propertyItems)
{
    const int exifOrientationId = 0x112;
    PropertyItem orientationProp = propertyItems.FirstOrDefault(i => i.Id == exifOrientationId);
    
    if (orientationProp == null) return;
    
    int val = BitConverter.ToUInt16(orientationProp.Value, 0);
    // Map the EXIF orientation value (1-8) to the rotation/flip needed to make the image upright.
    RotateFlipType rotateFlipType = val switch
    {
        2 => RotateFlipType.RotateNoneFlipX, 
        3 => RotateFlipType.Rotate180FlipNone, 
        4 => RotateFlipType.Rotate180FlipX, 
        5 => RotateFlipType.Rotate90FlipX, 
        6 => RotateFlipType.Rotate90FlipNone, 
        7 => RotateFlipType.Rotate270FlipX, 
        8 => RotateFlipType.Rotate270FlipNone, 
        _ => RotateFlipType.RotateNoneFlipNone, 
    };
    
    if (rotateFlipType != RotateFlipType.RotateNoneFlipNone)
    {
        image.RotateFlip(rotateFlipType);
    }
}

After rotating, my photos are as follows:

In this way, portrait photos can be recognized as well.
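Putting the two pieces together, one way to wire this up (a sketch of the assumed pipeline; the article does not show the exact combination) is to apply the EXIF rotation before resizing, so the uploaded bytes are already upright:

// Assumed wiring, not necessarily the author's exact pipeline:
// rotate according to EXIF first, then resize and re-encode as JPEG.
byte[] CompressUprightImage(string image, int edgeLimit = 1920)
{
    using var bmp = Bitmap.FromFile(image);
    HandleOrientation(bmp, bmp.PropertyItems);

    using var resized = (1.0 * Math.Max(bmp.Width, bmp.Height) / edgeLimit) switch
    {
        var x when x > 1 => new Bitmap(bmp, new Size((int)(bmp.Width / x), (int)(bmp.Height / x))),
        _ => bmp,
    };

    using var ms = new MemoryStream();
    resized.Save(ms, ImageFormat.Jpeg);
    return ms.ToArray();
}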

Speeding things up with parallelism

As mentioned earlier, a folder may hold tens of thousands of files, and uploading and detecting them one by one can be slow. The sequential code looks like this:

Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
  .Select(file => 
  {
    byte[] bytes = CompressImage(file);
    var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
    (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
    return (file, faces: result.faces.ToList());
  })
  .SelectMany(x => x.faces.Select(face => (x.file, face)))
  .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));

To speed this up, you can enable parallel uploads. With the LINQ support in C#/.NET, adding a single line, .AsParallel(), is all it takes:

Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
  .AsParallel() // this is the one line that was added
  .Select(file => 
  {
    byte[] bytes = CompressImage(file);
    var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
    (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
    return (file, faces: result.faces.ToList());
  })
  .SelectMany(x => x.faces.Select(face => (x.file, face)))
  .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));
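One caveat (my addition, not from the article): unbounded parallelism can run into the service's request rate limits, so it may be worth capping the number of concurrent uploads with PLINQ's .WithDegreeOfParallelism():

// A minimal sketch: the same pipeline with concurrency capped.
// The value 4 is an arbitrary example, not a recommendation.
var limitedFiles = GetFiles(inFolder)
    .AsParallel()
    .WithDegreeOfParallelism(4);
// ...the Select / SelectMany / ToDictionary steps stay exactly the same.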

Breakpoint resume

As mentioned above, there are thousands of photos. If the network transfer fails, or the coffee on the desk gets knocked over (who knows?), or everything goes fine but I simply want to run some other analysis, everything starts over from scratch. We can borrow the "breakpoint resume" mechanism familiar from download tools.

In essence, it is a cache: record the result for each file that has been processed, read from the cache on the next run, and persist the cache to a JSON file:

class Cache<T>
{
    static string cacheFile = outFolder + @$"\cache-{typeof(T).Name}.json";
    Dictionary<string, T> cachingData;

    public Cache()
    {
        cachingData = File.Exists(cacheFile) switch
        {
            true => JsonSerializer.Deserialize<Dictionary<string, T>>(File.ReadAllBytes(cacheFile)),
            _ => new Dictionary<string, T>()
        };
    }

    public T GetOrCreate(string key, Func<T> fetchMethod)
    {
        lock (this)
        {
            // Dictionary<TKey, TValue> is not safe to read while another thread writes,
            // so the lookup is protected by the same lock as the writes below.
            if (cachingData.TryGetValue(key, out T cachedValue))
            {
                return cachedValue;
            }
        }

        var realValue = fetchMethod();
        
        lock(this)
        {
            cachingData[key] = realValue;
            File.WriteAllBytes(cacheFile, JsonSerializer.SerializeToUtf8Bytes(cachingData, new JsonSerializerOptions
            {
                WriteIndented = true, 
            }));
            return realValue;
        }
    }
}

Note the lock statements in the code above, which keep the cache thread safe when multiple uploads run in parallel.

To use it, just add one line of code inside the Select:

var cache = new Cache<List<DetectedFace>>(); // the key addition
Dictionary<Guid, (string file, DetectedFace face)> faces = GetFiles(inFolder)
  .AsParallel()
  .Select(file => (file: file, faces: cache.GetOrCreate(file, () => // the key addition
  {
    byte[] bytes = CompressImage(file);
    var result = (file, faces: fc.Face.DetectWithStreamAsync(new MemoryStream(bytes)).GetAwaiter().GetResult());
    (result.faces.Count == 0 ? $"{file} not detect any face!!!" : $"{file} detected {result.faces.Count}.").Dump();
    return result.faces.ToList();
  })))
  .SelectMany(x => x.faces.Select(face => (x.file, face)))
  .ToDictionary(x => x.face.FaceId.Value, x => (file: x.file, face: x.face));

Framing people's faces

There are a lot of photos, and when an event is busy, or a group photo contains dozens of people, it helps to draw a box around each detected face so you can see who was actually matched.

Note that drawing the boxes is also quite tricky. Recall that the uploaded photo was compressed and rotated, so the coordinates in the returned DetectedFace refer to that compressed and rotated image. If you draw them on the original photo without accounting for this, the boxes will be completely off, so we need to reapply the earlier transformations:

using var bmp = Bitmap.FromFile(item.info.file);
HandleOrientation(bmp, bmp.PropertyItems);
using (var g = Graphics.FromImage(bmp))
{
  using var brush = new SolidBrush(Color.Red);
  using var pen = new Pen(brush, 5.0f);
  var rect = item.info.face.FaceRectangle;
  // Scale the coordinates (which refer to the 1920-limited upload) back up to the original size.
  float scale = Math.Max(1.0f, (float)(1.0 * Math.Max(bmp.Width, bmp.Height) / 1920.0));
  g.ScaleTransform(scale, scale);
  g.DrawRectangle(pen, new Rectangle(rect.Left, rect.Top, rect.Width, rect.Height));
}
bmp.Save(Path.Combine(dir, Path.GetFileName(item.info.file)));

Using the photo above, the result looks like this (a bit like the face-detection boxes you see when a camera focuses):

The 1000-face limit

The .GroupAsync method can only handle 1000 FaceIds per call, while the 800-plus photos from the last event produced more than 2000 FaceIds, so some batching is necessary.

The easiest way to batch is the System.Interactive package, which provides APIs as convenient as Rx.NET (the ones plain LINQ lacks) without pulling in heavyweight types such as Observable<T>, so it is very easy to use.

Here I used the .Buffer(int) function, which splits an IEnumerable<T> into chunks of the specified size (e.g. 1000), as follows:

foreach (var buffer in faces
  .Buffer(1000)
  .Select((list, groupId) => (list, groupId)))
{
  GroupResult group = await fc.Face.GroupAsync(buffer.list.Select(x => x.Key).ToList());
  var folder = outFolder + @"\gid-" + buffer.groupId;
  CopyGroup(folder, group, faces);
}

 

Original link: https://www.cnblogs.com/sdflysha/p/20191122-dotnet-face-detection.html