This story comes from a small project that aims to automatically take screenshots of the output of commands run over ssh, driven by a fairly large JSON file that records many ssh configurations. The challenge is that it’s not trivial to programmatically control ssh’s input, or to automatically take screenshots of one specific window.

At first, I planned to do this in C# on .NET, which is backed by Microsoft. I believed it was the best choice for anything aimed specifically at Windows. I also realized that I couldn’t finish the task with a traditional console application alone; I would need one of the GUI frameworks to take screenshots. As everyone knows, the GUI frameworks Microsoft has created for Windows, in other words the native frameworks, are extremely messy! There are old ones such as WinForms and WPF, the fairly new UWP with its obvious disadvantages, and the very new but functionally crippled WinUI 3. It’s truly hell!

Let’s set the messy native frameworks aside for now and just look at the technologies for taking screenshots on Windows. I found three that are tied to Windows and one that comes with .NET itself.

The first is Windows Graphics Capture, which was introduced in Windows 10 version 1803 and is aimed at a better experience in UWP. The advantage is that it can also be used from WinUI 3, so I could get it by adding a single NuGet dependency. The disadvantage is that the item to capture is a GraphicsCaptureItem, and as far as I could tell it can only be obtained through the GraphicsCapturePicker.PickSingleItemAsync() method, which shows a picker to the user and therefore cannot be automated programmatically.

The next is the DXGI Desktop Duplication API. It uses the GPU to capture the screen, which makes it fast and light on resources. It looks nice, but the tutorial provided by Microsoft is hard to understand. The sample code is too large and drags in a lot of things that are unnecessary for a first example, such as multithreading. In the end I gave up on learning it.

The last is GDI+. This is the oldest technology and I was somewhat reluctant to learn it. The advantage is that there are plenty of tutorials on the Internet, so I could quickly copy one and get a usable example. I ended up modifying someone else’s project and got a program that automatically takes full-screen screenshots. I think it should also be possible to enumerate all windows and capture only a specified one, but I didn’t bother to learn that.

.NET also provides Graphics.CopyFromScreen, which I counted as the cross-platform option (although System.Drawing.Common has been supported only on Windows since .NET 6). It can only take full-screen screenshots and cannot capture a specified window.

After reviewing the native technologies provided by Microsoft, I found that some of them (Graphics.CopyFromScreen and Windows Graphics Capture) are easy to use but functionally crippled, while the others (DXGI and GDI+) are hard to use. I just didn’t want to write C++. I wanted something that is easy to use and able to capture a screenshot of a specified window, and I didn’t care much about performance. Third-party tools such as Chrome provide quite easy-to-use APIs for taking a screenshot of a specific node or of the whole page. I guessed that Electron, which is built on the same technology as Chrome, might also provide an easy-to-use screenshot API. Yes! It provides desktopCapturer, which meets all of my needs. JavaScript is also easier to write than C++. I finished the task quickly using Electron.
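To give a rough idea of what this looks like, here is a minimal sketch in TypeScript rather than the actual code from my project. It assumes the target window can be identified by a fragment of its title. The helper names pickWindowSource and captureWindowPng, the "ssh" title fragment, and the output file name are made up for illustration. The small interfaces only mirror the parts of Electron’s source objects that the sketch touches (name, thumbnail, toPNG), and the getSources function is passed in as a parameter so the logic can be tried without Electron.

```typescript
// Minimal structural types mirroring the parts of Electron's
// desktopCapturer sources that this sketch relies on.
interface Thumbnail {
  toPNG(): Uint8Array;
}

interface CaptureSource {
  id: string;
  name: string; // for window sources this is the window title
  thumbnail: Thumbnail;
}

interface SourceOptions {
  types: Array<"window" | "screen">;
  thumbnailSize: { width: number; height: number };
}

type GetSources = (options: SourceOptions) => Promise<CaptureSource[]>;

// Hypothetical helper: return the first source whose title contains
// the given fragment, ignoring case, or undefined if none matches.
function pickWindowSource(
  sources: CaptureSource[],
  titleFragment: string
): CaptureSource | undefined {
  const needle = titleFragment.toLowerCase();
  return sources.find((s) => s.name.toLowerCase().indexOf(needle) !== -1);
}

// Hypothetical helper: list window sources, pick one by title, and
// return its thumbnail encoded as PNG bytes.
async function captureWindowPng(
  getSources: GetSources,
  titleFragment: string,
  size: { width: number; height: number } = { width: 1920, height: 1080 }
): Promise<Uint8Array> {
  const sources = await getSources({ types: ["window"], thumbnailSize: size });
  const match = pickWindowSource(sources, titleFragment);
  if (match === undefined) {
    throw new Error(`no window title contains "${titleFragment}"`);
  }
  return match.thumbnail.toPNG();
}

// In an Electron main process (once the app is ready) this could be
// wired up roughly like this:
//
//   import { desktopCapturer } from "electron";
//   import { writeFile } from "node:fs/promises";
//
//   const png = await captureWindowPng(
//     (options) => desktopCapturer.getSources(options),
//     "ssh"
//   );
//   await writeFile("ssh-window.png", png);
```

The screenshot here is the source’s thumbnail, so the requested thumbnailSize bounds the resolution of the saved image. That was good enough for my purpose since I didn’t care about performance or pixel-perfect output.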

The lesson is that if you just want to get a task done quickly, you should first check whether cross-platform web technologies meet your needs, instead of starting with the messy native ones. Not to mention that Windows desktop technologies are truly hell.