18 Mar 2017, by mrexodia
Hello,
First of all, apologies for the long absence. I have been dealing with personal issues and university, so writing this blog every week was an easy thing to cross off my list of things to do (especially considering I made it rather stressful for myself to produce these). I don’t exactly know yet how I will approach this blog from now on, but it will definitely not be every week. Note: If you have time, please write an entry for this blog! You can find more information here. If you want to write something but don’t know exactly how, get in contact to discuss a topic with us.
Today I would like to discuss performance and how caching can drastically improve it. If you don’t read this but use x64dbg, at least install the GetCharABCWidthsI_cache plugin to take advantage of this performance improvement…
To render those beautifully highlighted instructions, x64dbg uses a self-cooked rich-text format called CustomRichText_t:
enum CustomRichTextFlags
{
    FlagNone,
    FlagColor,
    FlagBackground,
    FlagAll
};
struct CustomRichText_t
{
    QString text;
    QColor textColor;
    QColor textBackground;
    CustomRichTextFlags flags;
    bool highlight;
    QColor highlightColor;
};
This structure describes a single unit of text, with various options for highlighting it. This is extremely flexible, simple, easy to extend and doesn’t require any parsing of a text-based markup language like HTML or RTF. Since the most-used/refreshed views (disassembly, dump and stack) use this, rendering these units has to be very fast; otherwise the user will suffer noticeable lag.
Now when profiling and holding down F7 (step into) I noticed that the majority of the time is spent in functions related to Qt, the first having to do with QPainter::fillRect and the second being related to QPainter::drawText. Both these functions are called very often from RichTextPainter::paintRichText.

It looks like QPainter::fillRect is part of drawing the main window and I cannot find a way to optimize it away, but the GetCharABCWidthsI function is definitely a candidate for optimization! The root cause appears to be in a function called QWindowsFontEngine::getGlyphBearings that is used during the layout phase of text. However, GetCharABCWidthsI returns information about the font, which only has to be retrieved once! Take a look at the code:
void QWindowsFontEngine::getGlyphBearings(glyph_t glyph, qreal *leftBearing, qreal *rightBearing)
{
    HDC hdc = m_fontEngineData->hdc;
    SelectObject(hdc, hfont);
    if (ttf)
    {
        ABC abcWidths;
        GetCharABCWidthsI(hdc, glyph, 1, 0, &abcWidths);
        if (leftBearing)
            *leftBearing = abcWidths.abcA;
        if (rightBearing)
            *rightBearing = abcWidths.abcC;
    }
    else {
        QFontEngine::getGlyphBearings(glyph, leftBearing, rightBearing);
    }
}
The important information here is that SelectObject is called to set the current font handle and immediately after that GetCharABCWidthsI is called to query information on a single glyph. To add a cache (and some diagnostics) I will write a plugin that hooks these functions and provides a cache of the glyph data. I’ll be using MinHook to accomplish this since it’s really easy to use.
The code for SelectObject is pretty straightforward. The goal here is to prepare a global variable with the HFONT handle that will be used in GetCharABCWidthsI to get the appropriate information. The reason for this is that the function GetCurrentObject is very slow and would generate a little spike of its own in the performance profile.
static HGDIOBJ WINAPI hook_SelectObject(
    HDC hdc,
    HGDIOBJ h)
{
    auto result = original_SelectObject(hdc, h);
    auto found = fontData.find(h);
    if(checkThread() && found != fontData.end())
    {
        curHdc = hdc;
        curFont = &found->second;
    }
    else
    {
        curHdc = nullptr;
        curFont = nullptr;
    }
    return result;
}
This function will also call checkThread() to avoid having to deal with thread-safety and it will only select font handles that were already used by GetCharABCWidthsI to retrieve data. The hook for GetCharABCWidthsI is a little more involved, but shouldn’t be difficult to understand.
static BOOL WINAPI hook_GetCharABCWidthsI(
    __in HDC hdc,
    __in UINT giFirst,
    __in UINT cgi,
    __in_ecount_opt(cgi) LPWORD pgi,
    __out_ecount(cgi) LPABC pabc)
{
    //Don't cache if called from a different thread
    if(!checkThread())
        return original_GetCharABCWidthsI(hdc, giFirst, cgi, pgi, pabc);
    //Get the current font object and get a (new) pointer to the cache
    if(!curFont || curHdc != hdc)
    {
        auto hFont = GetCurrentObject(hdc, OBJ_FONT);
        auto found = fontData.find(hFont);
        if(found == fontData.end())
            found = fontData.insert({ hFont, FontData() }).first;
        curFont = &found->second;
    }
    curFont->count++;
    //Functions to lookup/store glyph index data with the cache
    bool allCached = true;
    auto lookupGlyphIndex = [&](UINT index, ABC & result)
    {
        auto found = curFont->cache.find(index);
        if(found == curFont->cache.end())
            return allCached = false;
        result = found->second;
        return true;
    };
    auto storeGlyphIndex = [&](UINT index, ABC & result)
    {
        curFont->cache[index] = result;
    };
    //A pointer to an array that contains glyph indices.
    //If this parameter is NULL, the giFirst parameter is used instead.
    //The cgi parameter specifies the number of glyph indices in this array.
    if(pgi == NULL)
    {
        for(UINT i = 0; i < cgi; i++)
            if(!lookupGlyphIndex(giFirst + i, pabc[i]))
                break;
    }
    else
    {
        for(UINT i = 0; i < cgi; i++)
            if(!lookupGlyphIndex(pgi[i], pabc[i]))
                break;
    }
    //If everything was cached we don't have to call the original
    if(allCached)
    {
        curFont->hits++;
        return TRUE;
    }
    curFont->misses++;
    //Call original function
    auto result = original_GetCharABCWidthsI(hdc, giFirst, cgi, pgi, pabc);
    if(!result)
        return FALSE;
    //A pointer to an array that contains glyph indices.
    //If this parameter is NULL, the giFirst parameter is used instead.
    //The cgi parameter specifies the number of glyph indices in this array.
    if(pgi == NULL)
    {
        for(UINT i = 0; i < cgi; i++)
            storeGlyphIndex(giFirst + i, pabc[i]);
    }
    else
    {
        for(UINT i = 0; i < cgi; i++)
            storeGlyphIndex(pgi[i], pabc[i]);
    }
    return TRUE;
}
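Stripped of the GDI specifics, the hook above boils down to a per-font memoization table with hit/miss counters. Here is a minimal portable sketch of that pattern. Note that FontHandle, GlyphMetrics, GlyphCache and the slowQuery callback are hypothetical stand-ins I made up for HGDIOBJ, ABC and the original GetCharABCWidthsI; this is not the plugin's code and it ignores the thread check entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins for HGDIOBJ and ABC (not the real Windows types).
using FontHandle = std::uintptr_t;
struct GlyphMetrics { int a; unsigned b; int c; };

// Per-font cache plus the same three counters the plugin reports.
struct FontStats
{
    std::unordered_map<unsigned, GlyphMetrics> cache;
    int count = 0, hits = 0, misses = 0;
};

class GlyphCache
{
public:
    // 'slowQuery' plays the role of the original (expensive) API call.
    using Query = std::function<bool(FontHandle, const std::vector<unsigned>&, std::vector<GlyphMetrics>&)>;

    explicit GlyphCache(Query slowQuery) : mQuery(std::move(slowQuery)) {}

    bool get(FontHandle font, const std::vector<unsigned>& glyphs, std::vector<GlyphMetrics>& out)
    {
        FontStats& data = mFonts[font];
        data.count++;
        out.clear();
        // Only answer from the cache if every requested glyph is present.
        bool allCached = true;
        for(unsigned glyph : glyphs)
        {
            auto found = data.cache.find(glyph);
            if(found == data.cache.end())
            {
                allCached = false;
                break;
            }
            out.push_back(found->second);
        }
        if(allCached)
        {
            data.hits++;
            return true;
        }
        data.misses++;
        // Fall back to the slow path and remember the results.
        out.clear();
        if(!mQuery(font, glyphs, out) || out.size() != glyphs.size())
            return false;
        for(std::size_t i = 0; i < glyphs.size(); i++)
            data.cache[glyphs[i]] = out[i];
        return true;
    }

    const FontStats& stats(FontHandle font) { return mFonts[font]; }

private:
    Query mQuery;
    std::unordered_map<FontHandle, FontStats> mFonts;
};
```

Like the hook, a request counts as a hit only when all glyphs are already cached; a partially cached request goes to the slow path as a whole.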
A command abcdata is also added to the plugin to give some more insight into the number of cache misses and such, and it appears to have been worth it (these numbers are from running x64dbg for about 20 seconds)!
HGDIOBJ: 3B0A22E9
count: 4, hits: 2, misses: 2
HGDIOBJ: A70A1E93
count: 1374, hits: 1348, misses: 26
HGDIOBJ: 000A1F1B
count: 140039, hits: 139925, misses: 114
HGDIOBJ: 7C0A2302
count: 581, hits: 550, misses: 31
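To put these counters in perspective you can turn them into a hit rate. In each entry count equals hits plus misses, so the rate is simply hits divided by count. A tiny sketch (hitRatePercent is a name I made up for illustration, it is not part of the plugin):

```cpp
#include <cassert>

// Percentage of calls that were answered from the cache.
double hitRatePercent(unsigned long long hits, unsigned long long count)
{
    assert(hits <= count);
    return count == 0 ? 0.0 : 100.0 * double(hits) / double(count);
}
```

For the busiest font above, hitRatePercent(139925, 140039) comes out at roughly 99.9%, meaning only 114 calls out of about 140000 still reached the real GetCharABCWidthsI.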
The profile also confirms that this helped and I noticed a small improvement in speed!

A ticket has been opened in the Qt issue tracker and I hope this can help in further improving Qt. There have also been various suggestions on how to handle drawing lots of text which I will try another time. You can get the GetCharABCWidthsI_cache plugin if you want to try this yourself.
That’s it for today, have a good day!
Duncan
25 Dec 2016, by mrexodia
This is number sixteen of the weekly digests. Last week I was sick, so this one will again account for two weeks…
Christmas
Merry Christmas everyone!
x64dbgpylib
Some effort has been made towards supporting mona.py by porting windbglib to x64dbgpy. You can help out by porting a few functions outlined in this issue.
Translations
Various people worked very hard to completely translate x64dbg into Korean. The state of the translations is as follows:
- Korean (100%)
- Turkish (96%)
- Dutch (94%)
- Chinese Simplified (89%)
- Spanish (87%)
- German (87%)
- Russian (83%)
Restart as admin
If a process requires elevation on start, CreateProcess would fail with ERROR_ELEVATION_REQUIRED. This is now detected and you can allow x64dbg to restart itself as administrator.

Certain operations (such as setting x64dbg as JIT debugger) also require elevation, so a menu option has been added! It will automatically reload the current debuggee, but it (obviously) cannot restore the current state, so think of this as the restart option.

Secure symbol servers
The default symbol servers have been switched to HTTPS. See pull request #1300 by xiaoyinl.
Microsoft symbol servers currently have issues and you might have to try to download symbols multiple times.
Fixed weird display issue on the tab bar
Issue #1339 has been fixed and the buttons to scroll in the tab bar should now appear correctly.

Various copying enhancements
There are various enhancements to copying addresses and disassembly. See pull request #1363 by ThunderCls for more details.
Executables with a malformed header, where e_lfanew points higher than 0x1000 bytes, would be detected as invalid by x64dbg. This has now been corrected by jossgray in pull request #1369.
Fixed some bugs with handling big command lines
The maximum command line size has been increased to 65k to support modification of very long command lines (such as custom JVMs with many arguments).
Launcher improvements
There have been various improvements to the launcher, mostly with .NET executables and also the handling of the IMAGE_DOS_HEADER.
Load/free library in the symbols view
Pull request #1372 by ThunderCls introduced the freelib command that allows you to unload a library from the debuggee, in addition to a GUI for the loadlib command.

String search improvements
There have been various improvements to the string search and UTF-8 strings will be escaped correctly.
Don’t change the active window when closing a tab
Previously if you detached a tab and pressed the close button it would keep that tab active, while usually the desired behaviour is to hide the tab in the background. See pull request #1375 by changeofpace for more details.
Workaround for a capstone bug
The instruction test eax, ecx is incorrectly disassembled by capstone as test ecx, eax. This has been worked around by the following ugly code that simply swaps the arguments…
//Nasty workaround for https://github.com/aquynh/capstone/issues/702
if(mSuccess && GetId() == X86_INS_TEST && x86().op_count == 2 && x86().operands[0].type == X86_OP_REG && x86().operands[1].type == X86_OP_REG)
{
    std::swap(mInstr->detail->x86.operands[0], mInstr->detail->x86.operands[1]);
    char* opstr = mInstr->op_str;
    auto commasp = strstr(opstr, ", ");
    if(commasp)
    {
        *commasp = '\0';
        char second[32] = "";
        strcpy_s(second, commasp + 2);
        auto firstsp = commasp;
        while(firstsp >= opstr && *firstsp != ' ')
            firstsp--;
        if(firstsp != opstr)
        {
            firstsp++;
            char first[32] = "";
            strcpy_s(first, firstsp);
            *firstsp = '\0';
            strcat_s(mInstr->op_str, second);
            strcat_s(mInstr->op_str, ", ");
            strcat_s(mInstr->op_str, first);
        }
    }
}
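The string part of the workaround is easier to follow when written with std::string. The sketch below is not what x64dbg does (the real code has to edit capstone's fixed-size op_str buffer in place); swapOperands is a hypothetical helper that only illustrates the idea of splitting on ", " and putting the halves back in reverse order.

```cpp
#include <cassert>
#include <string>

// Swap the two operands of an operand string such as "eax, ecx".
// The input is returned unchanged unless it contains exactly one ", ".
std::string swapOperands(const std::string& opStr)
{
    const std::string sep = ", ";
    auto pos = opStr.find(sep);
    if(pos == std::string::npos || opStr.find(sep, pos + sep.size()) != std::string::npos)
        return opStr;
    return opStr.substr(pos + sep.size()) + sep + opStr.substr(0, pos);
}
```

Calling swapOperands("eax, ecx") gives "ecx, eax", while strings with zero or more than one separator are left alone.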
The option ‘Autocomments only on CIP’ would only show non-user comments on the CIP instruction. Issue #1386 proposed a different solution and currently only register-based comments will be hidden.
Save and restore the window position and size
Pull request #1385 by changeofpace introduced saving of the main window position and size.
Allow permanent highlighting mode
Some people prefer the way IDA handles highlighting. Clicking on a register/immediate will highlight it everywhere else, even if you want to keep the previous highlighting but want to click somewhere else. I personally think this is a bad way of handling highlighting, but an option has been introduced that has similar behaviour. Pull request #1388 had similar functionality, but I rewrote it to be optional and more intuitive.

If you don’t click on a highlightable object it will not change the highlighting so (unlike IDA) you can do your normal operations while keeping the desired highlighting.

Copy as HTML
Pull request #1394 by torusrxxx introduces an option that copies the disassembly/dump as HTML allowing you to paste it in Word:

Usual things
Thanks a lot to all the contributors!
That has been about it for this time again. If you have any questions, contact us on Telegram, Gitter or IRC. If you want to see the changes in more detail, check the commit log.
You can always get the latest release of x64dbg here. If you are interested in contributing, check out this page.
Finally, if someone is interested in hiring me to work on x64dbg more, please contact me!
11 Dec 2016, by mrexodia
This is number fifteen of the weekly digests. This time it will highlight the things that happened in the last two weeks, since last week wasn’t so busy.
Log redirection encoding
Previously the default log redirect option was UTF-16 with BOM, but this has been changed to support UTF-8 Everywhere. You can get the old behaviour back in the settings dialog if you favor UTF-16.

The sizes of labels and comments are limited to ~256 characters and this is now properly enforced in the GUI to avoid nasty surprises. You will now also be warned if you set a duplicate label.
Large address awareness
The 32 bit version of x64dbg previously wasn’t ‘Large address aware’. It now is, which means that you can consume more than 2GB of memory if you feel like it.
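If you want to check whether an executable is large address aware, the flag lives in the Characteristics field of the PE file header (IMAGE_FILE_LARGE_ADDRESS_AWARE, value 0x0020). Below is a rough sketch that checks it on a raw byte buffer. The function names are mine, it does no validation beyond the bare minimum, and it is not code from x64dbg.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// IMAGE_FILE_LARGE_ADDRESS_AWARE from the PE/COFF format.
constexpr std::uint16_t kLargeAddressAware = 0x0020;

// Read a little-endian value of 'bytes' bytes (at most 4) at offset 'off'.
static std::uint32_t readLE(const std::vector<std::uint8_t>& data, std::size_t off, int bytes)
{
    std::uint32_t value = 0;
    for(int i = 0; i < bytes; i++)
        value |= std::uint32_t(data[off + i]) << (8 * i);
    return value;
}

// Returns true if the flag is set, false if it is not set or the image is malformed.
bool isLargeAddressAware(const std::vector<std::uint8_t>& image)
{
    // IMAGE_DOS_HEADER is 0x40 bytes and starts with "MZ"
    if(image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return false;
    // e_lfanew (offset 0x3C) points to the PE signature
    std::size_t peOffset = readLE(image, 0x3C, 4);
    if(peOffset > image.size() || image.size() - peOffset < 24)
        return false;
    if(readLE(image, peOffset, 4) != 0x00004550) // "PE\0\0"
        return false;
    // Signature (4 bytes) + 18 bytes of file header precede Characteristics
    return (readLE(image, peOffset + 22, 2) & kLargeAddressAware) != 0;
}
```

To use it, read the first few kilobytes of the executable into a vector and pass it in.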
Optimized logging speed
The logging should be somewhat faster now, especially when redirecting it to a file and disabling the GUI log. You can find more details here, but the numbers might be off since additional changes were made and no benchmarks were done.
Fixed a crash when clicking out of range in the side bar
Issue #1299 described a crash and a dump was provided but I did not have debug symbols for that particular build. To figure out what was happening I used x64dbg to debug x64dbg and then some pattern searching to find the crash location in a build for which I did have symbols. The person who opened the issue and a video is available here.
Updated Scylla
Recently a tool called pe_unmapper by malware analyst hasherezade was released and I thought it would be a nice thing to have in x64dbg so I added it to Scylla since it already had a framework to do exactly that. You can find a simple video demonstration here.
There are some new functions available for plugins that help with querying the PROCESS_INFORMATION of the debuggee. These functions are:
BRIDGE_IMPEXP HANDLE DbgGetProcessHandle();
BRIDGE_IMPEXP HANDLE DbgGetThreadHandle();
BRIDGE_IMPEXP DWORD DbgGetProcessId();
BRIDGE_IMPEXP DWORD DbgGetThreadId();
Various improvements to the type system
Issue #1305 highlights some issues with the type system. Several of them have been addressed and hopefully everything is a bit more stable now…
More styles
Various additional styles have been added on the wiki. Check them out below!



Case-insensitive regex search in symbol view
It is now possible to use both case sensitive and insensitive regex searching in the symbol view.

GUI speed improvements
A bad locking mechanism has been replaced by Event Objects, resulting in a noticeable performance improvement, mostly when visiting types.
Intercept more functions for crashdumps
Some crash dumps were missing information and Nukem addressed this in pull request #1338. This might help on some Windows 10 installations.
Don’t change selection when the search text changes
Thanks to lynnux’ pull request #1340 the last cursor position will now be remembered when removing the search string in the search list view. This is very useful if you want to for example find string references in close proximity to one you are looking for. Below is a GIF demonstrating this new feature.

Make x64dbg run on Wine again
There is a branch called wine that runs under Wine. The reason that x64dbg is not running under Wine is that the Concurrency::unbounded_buffer is not implemented. The branch is not very well-tested but feedback is appreciated!
Added more advanced plugin callbacks
In pull request #1314 torusrxxx added automatic detection of PEB fields as labels. This functionality has instead been moved to the LabelPEB plugin and the plugin callbacks CB_VALFROMSTRING and CB_VALTOSTRING have been added to allow plugins to add additional behavior to the expression resolver.

An interesting piece of documentation on access violation exceptions is now represented in x64dbg with pull request #1361 by changeofpace.
The first element of the array contains a read-write flag that indicates the type of operation that caused the access violation. If this value is zero, the thread attempted to read the inaccessible data. If this value is 1, the thread attempted to write to an inaccessible address. If this value is 8, the thread causes a user-mode data execution prevention (DEP) violation.
The second array element specifies the virtual address of the inaccessible data.
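The documentation quoted above translates almost directly into code. The sketch below decodes the two array elements into a short description; describeAccessViolation and its output format are my own invention for illustration and not how x64dbg prints it.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// info0/info1 correspond to the first and second array elements
// described in the quoted documentation.
std::string describeAccessViolation(std::uint64_t info0, std::uint64_t info1)
{
    const char* operation = "unknown";
    switch(info0)
    {
    case 0:
        operation = "read";
        break;
    case 1:
        operation = "write";
        break;
    case 8:
        operation = "execute (DEP)";
        break;
    default:
        break;
    }
    char buffer[64];
    std::snprintf(buffer, sizeof(buffer), "%s at %016llX", operation, (unsigned long long)info1);
    return buffer;
}
```

For example, describeAccessViolation(1, 0x401000) returns "write at 0000000000401000".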

Fixed incorrect detection of unary operators
The expression (1<<5)+4 would be interpreted as incorrect because the + was treated as a unary operator. This has now been fixed!
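The usual rule for telling unary and binary signs apart is to look at what comes before the sign: after an operand or a closing parenthesis it is binary, otherwise it is unary. The sketch below shows that rule in isolation. It is a simplification I wrote for this post (isUnaryAt is a hypothetical helper), not the actual x64dbg expression parser.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Returns true if the '+' or '-' at index 'pos' should be treated as unary.
bool isUnaryAt(const std::string& expr, std::size_t pos)
{
    // Walk back to the previous non-space character
    while(pos > 0)
    {
        char prev = expr[pos - 1];
        if(std::isspace((unsigned char)prev))
        {
            pos--;
            continue;
        }
        // An operand or ')' before the sign makes it a binary operator
        bool endsOperand = std::isalnum((unsigned char)prev) || prev == '_' || prev == ')';
        return !endsOperand;
    }
    // Nothing before the sign: start of the expression, so unary
    return true;
}
```

With this rule the + in (1<<5)+4 follows a closing parenthesis and is binary, while the - in 2*-3 follows another operator and is unary.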
Remove breakpoints when clearing the database
The dbclear command didn’t remove breakpoints from the process, causing some weird behavior if you hit a breakpoint anyway. This should now be fixed.
Fixed bug with searching in the memory map
A bug has been fixed in the findallmem command where the size argument was interpreted incorrectly, causing a search of the entire process memory to fail.
Improvements to the breakpoint view
Pull requests #1359 by ThunderCls and #1346 by ner0x652 have added some improvements to the breakpoint view. You can now see if CIP is on the current breakpoint and the edit dialog will show the full symbolic address in the title.

Find window in the attach dialog
You can now find a window by title in the attach dialog to attach to a process without knowing the PID. There is also a new config command that can be used by scripts to get/set configuration values. More details in pull request #1355.

Usual stuff
Thanks a lot to all the contributors!
That has been about it for this time again. If you have any questions, contact us on Telegram, Gitter or IRC. If you want to see the changes in more detail, check the commit log.
You can always get the latest release of x64dbg here. If you are interested in contributing, check out this page.
Finally, if someone is interested in hiring me to work on x64dbg more, please contact me!