Working with GPT partition tables, CRC32s and byte-data in hard drives
https://www.derekdemuro.com/2017/06/30/working-with-gpt-partition-tables-crc32s-and-byte-data-in-hard-drives/
Fri, 30 Jun 2017

Calculating CRC32 for GPT Volumes.
[Image: MBR vs GPT partitioning]

Today, while having to calculate the CRCs for the GPT header and the GPT partition table manually (literally by hand…), I couldn't find any guide on how to do it, so here are my notes.

[Image: GPT volume structure (diagram from Wikipedia)]

From the diagram, we can tell the drive starts with:

  1. 512 bytes of (near) emptiness: the protective MBR, ending in 55 AA.
  2. 512 bytes for the GPT header, of which only the first 92 bytes are the actual header; the rest of the sector is unused.
  3. 128 bytes for each partition entry.

(All of the above is mirrored at the end of the drive, in reverse order, as the backup copy.)

Offset | Length | Contents
0 (0x00) | 8 bytes | Signature (“EFI PART”, 45h 46h 49h 20h 50h 41h 52h 54h, or 0x5452415020494645ULL on little-endian machines)
8 (0x08) | 4 bytes | Revision (for GPT version 1.0, through at least UEFI version 2.3.1, the value is 00h 00h 01h 00h)
12 (0x0C) | 4 bytes | Header size in little endian (in bytes, usually 5Ch 00h 00h 00h, i.e. 92 bytes)
16 (0x10) | 4 bytes | CRC32/zlib of header (offset +0 up to header size) in little endian, with this field zeroed during calculation
20 (0x14) | 4 bytes | Reserved; must be zero
24 (0x18) | 8 bytes | Current LBA (location of this header copy)
32 (0x20) | 8 bytes | Backup LBA (location of the other header copy)
40 (0x28) | 8 bytes | First usable LBA for partitions (primary partition table last LBA + 1)
48 (0x30) | 8 bytes | Last usable LBA (secondary partition table first LBA – 1)
56 (0x38) | 16 bytes | Disk GUID (also referred to as UUID on UNIXes)
72 (0x48) | 8 bytes | Starting LBA of the array of partition entries (always 2 in the primary copy)
80 (0x50) | 4 bytes | Number of partition entries in the array
84 (0x54) | 4 bytes | Size of a single partition entry (usually 80h, i.e. 128 bytes)
88 (0x58) | 4 bytes | CRC32/zlib of the partition array in little endian
92 (0x5C) | * | Reserved; must be zeroes for the rest of the block (420 bytes for a sector size of 512 bytes, but can be more with larger sector sizes)

From Wikipedia, the table above gives us the complete contents of the GPT header.
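With the table in hand, computing the header CRC is mechanical: zero the 4-byte field at offset 16, run CRC32 over the first header-size bytes (usually 92), and write the result back in little-endian order. Here is a minimal Python sketch of that, assuming a 512-byte sector size; /dev/sdX is a placeholder device name, and a dd dump of the disk works just as well:

```python
import struct, zlib

with open("/dev/sdX", "rb") as dev:              # or a dump that starts at LBA 0
    dev.seek(512)                                # skip LBA 0, the protective MBR
    header = bytearray(dev.read(512))            # the GPT header lives in LBA 1

size   = struct.unpack_from("<I", header, 12)[0]   # header size, usually 92 (0x5C)
stored = struct.unpack_from("<I", header, 16)[0]   # the CRC currently on disk
header[16:20] = b"\x00\x00\x00\x00"                # this field is zeroed during calculation
calc = zlib.crc32(header[:size]) & 0xFFFFFFFF
print(f"stored {stored:08X}  calculated {calc:08X}")
print("bytes to write at offset 0x10:", struct.pack("<I", calc).hex(" "))
```

With that base covered, let's check the structure of a partition entry.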

Offset | Length | Contents
0 (0x00) | 16 bytes | Partition type GUID
16 (0x10) | 16 bytes | Unique partition GUID
32 (0x20) | 8 bytes | First LBA (little endian)
40 (0x28) | 8 bytes | Last LBA (inclusive, usually odd)
48 (0x30) | 8 bytes | Attribute flags (e.g. bit 60 denotes read-only)
56 (0x38) | 72 bytes | Partition name (36 UTF-16LE code units)
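Each 128-byte entry decodes the same way; a small sketch (the helper name is mine, and note that the two GUIDs are stored in a mixed-endian layout, which Python's uuid module handles via bytes_le):

```python
import struct, uuid

def parse_entry(entry: bytes) -> dict:
    """Decode one 128-byte GPT partition entry, following the table above."""
    return {
        "type_guid":   str(uuid.UUID(bytes_le=entry[0:16])),    # mixed-endian on disk
        "unique_guid": str(uuid.UUID(bytes_le=entry[16:32])),
        "first_lba":   struct.unpack_from("<Q", entry, 32)[0],
        "last_lba":    struct.unpack_from("<Q", entry, 40)[0],  # inclusive
        "attributes":  struct.unpack_from("<Q", entry, 48)[0],
        "name":        entry[56:128].decode("utf-16-le").rstrip("\x00"),
    }
```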

Just for completeness, here are the partition attribute flags:

Bit | Content
0 | Platform required (required by the computer to function properly; an OEM partition, for example; disk partitioning utilities must preserve the partition as is)
1 | EFI firmware should ignore the content of the partition and not try to read from it
2 | Legacy BIOS bootable (equivalent to the active flag, typically bit 7 set, at offset +0h in the partition entries of the MBR partition table)
3–47 | Reserved for future use
48–63 | Defined and used by the individual partition type

For basic data partitions, the type-specific bits are:

Bit | Content
60 | Read-only
61 | Shadow copy (of another partition)
62 | Hidden
63 | No drive letter (i.e. do not automount)
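Testing those bits is a one-liner each; for example (the attribute value here is made up):

```python
attrs = (1 << 63) | (1 << 60) | (1 << 0)   # hypothetical: no automount, read-only, platform required
print(bool(attrs & (1 << 0)))              # platform required?
print(bool(attrs & (1 << 60)))             # read-only?
print(bool(attrs & (1 << 63)))             # no drive letter / do not automount?
```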
 

How do I make sure I get a valid GPT Partition Table?

First, make sure you are familiar with dd (“data destroyer”, haha). It is an excellent tool, and you should know it well if you're going to be playing with the data on a hard drive hands-on.
Quick recap: seek, skip, bs, count, conv, if and of should all be things you know by heart, blindfolded.

  1. seek: skip BLOCKS obs-sized blocks at start of output file.
  2. skip: skip BLOCKS obs-sized blocks at start of input file.
  3. count: copy only BLOCKS input blocks.
  4. conv: convert the file as per the comma-separated symbol list.
  5. bs: read and write BYTES at a time.
  6. if: input file.
  7. of: output file.
  8. conv=notrunc: I'm mentioning this one specifically since it's 100% required; without it, dd truncates the output file, successfully blowing away your data all at once.
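To make those concrete, here are the two invocations we will effectively be using, each paired with a rough Python equivalent (/dev/sdX is a placeholder; triple-check your if= and of= before touching a real drive):

```python
# like: dd if=/dev/sdX of=gpt.bin bs=512 skip=1 count=33
# (GPT header + partition array; skips the protective MBR in LBA 0)
SECTOR = 512
with open("/dev/sdX", "rb") as dev, open("gpt.bin", "wb") as out:
    dev.seek(1 * SECTOR)                  # skip= acts on the input side
    out.write(dev.read(33 * SECTOR))      # count=33 blocks of bs=512

# like: dd if=crc.bin of=gpt.bin bs=1 seek=16 conv=notrunc
# "r+b" is the conv=notrunc analogue; plain "wb" would truncate gpt.bin!
with open("gpt.bin", "r+b") as patch:
    patch.seek(16)                        # the header CRC field, offset 0x10
    patch.write(b"\x00\x00\x00\x00")      # the 4 bytes you want to drop in
```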

You will also need a crc32 application to run the calculation on the dd-generated binaries, because we're going to have to write that data into the files by hand. I'd also recommend finding yourself an excuse to practice editing those files manually in a hex editor.

Remember that data on the hard drive is very often written byte-reversed compared to how we write it in natural language (little endian). For example, if our CRC result were the hex value F7E58D1F, we would split it into bytes, F7 E5 8D 1F, and then write it in the hex editor as 1F 8D E5 F7 in the appropriate place. So remember: not mirrored, but byte-reversed; if you wrote F1D85E7F, you did it wrong!
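If you want to sanity-check the byte order before typing it into the hex editor, struct.pack with the little-endian format character does the swap for you:

```python
import struct
crc = 0xF7E58D1F
print(struct.pack("<I", crc).hex(" "))   # prints "1f 8d e5 f7", the on-disk byte order
```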

So let's take, for example, a dump of the GPT header and partition table: that is 16896 bytes, or 33 sectors (the byte count divided by the 512-byte sector size).

Using Okteta on Linux, you can quickly calculate the checksum of a highlighted part of the dump, as the next screenshot shows.
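If you'd rather script the check than highlight byte ranges, the partition-array CRC works the same way as the header CRC (a sketch assuming the 33-sector dump above was saved as gpt.bin, header first):

```python
import struct, zlib

with open("gpt.bin", "rb") as f:
    header = f.read(512)                                    # GPT header
    num_entries = struct.unpack_from("<I", header, 80)[0]   # usually 128
    entry_size  = struct.unpack_from("<I", header, 84)[0]   # usually 128
    stored      = struct.unpack_from("<I", header, 88)[0]   # array CRC at offset 0x58
    array = f.read(num_entries * entry_size)                # 16384 bytes by default

calc = zlib.crc32(array) & 0xFFFFFFFF
print(f"stored {stored:08X}  calculated {calc:08X}")
```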

[Image: Okteta editing some EFI headers…]
Jarvis – DLA (Digital Life Assistant)
https://www.derekdemuro.com/2014/08/01/jarvis-dla-digital-life-assistant/
Fri, 01 Aug 2014

The idea behind MultiAll

Abstract:

Managing multiple servers that do basically similar work and host in-house applications: avoiding configuration time sinks and automating server configuration, allowing fast deployments with no scaling limit.

The problem:

The problem with MultiAll is keeping all the servers in sync when they sit in different “Farms or Grids”, plus the fact that one server can provide services for more than one purpose. This is why we invented the “Pool system”, where all the servers respond or act for a defined pool.

If a server responds to multiple pools, the server is the aggregation of every pool it serves.

Understanding how we can make it work:

Our research shows it is possible to merge more than one server to work toward a common goal by making some modifications to normal database-driven apps so they keep a cache table: when a server is unable to communicate with the others in its pool, it saves the changes it has to make on its sibling servers, and deploys those changes later, once they are back online.
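As an illustration of that cache-table idea, here is a minimal sketch using SQLite; the schema and function names are hypothetical, not MultiAll's actual design:

```python
import sqlite3

db = sqlite3.connect("local.db")
db.execute("""CREATE TABLE IF NOT EXISTS change_cache (
                  id INTEGER PRIMARY KEY,
                  peer TEXT NOT NULL,       -- sibling server the change is destined for
                  statement TEXT NOT NULL   -- SQL to replay once the peer is back
              )""")

def record_change(peer, statement):
    """Queue a change locally while the peer is unreachable."""
    db.execute("INSERT INTO change_cache (peer, statement) VALUES (?, ?)",
               (peer, statement))
    db.commit()

def replay_changes(peer, peer_db):
    """Deploy the queued changes once the peer is back online."""
    for row_id, stmt in list(db.execute(
            "SELECT id, statement FROM change_cache WHERE peer = ?", (peer,))):
        peer_db.execute(stmt)
        db.execute("DELETE FROM change_cache WHERE id = ?", (row_id,))
    peer_db.commit()
    db.commit()
```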

On the file system side, every server is responsible for keeping the others in its pool up to date, partitioning the problem in two: every server is responsible both for the others and for itself. This works much like rsync: we only push the differences between servers, over a secure connection (e.g. a VPN).
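A sketch of what that push could look like, shelling out to rsync itself (the pool path and peer name are placeholders, and the transport is assumed to be SSH inside the VPN):

```python
import subprocess

def push_diffs(peer, pool_root="/srv/pool/"):
    """Send only the file differences for this pool to one sibling server."""
    # -a preserves metadata, -z compresses; rsync transfers deltas only
    subprocess.run(["rsync", "-az", "--delete", pool_root, f"{peer}:{pool_root}"],
                   check=True)

push_diffs("node2.pool.lan")
```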

At this point we would have an abstraction layer that lets multiple servers respond as one database and one file system.

Next is the Domain Name System problem: we don't want clients reaching servers that are offline due to maintenance or failures. For this, we're developing an abstraction layer on top of BIND 9 in which every server in a pool must be “network aware” of its peers; if a server cannot reach the others, it must update the DNS records to reflect the change.
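One way a node could withdraw a dead peer's record is a BIND 9 dynamic update through nsupdate; a sketch, with the DNS server, key path and host names all hypothetical:

```python
import subprocess

def withdraw_peer(fqdn, dns_server="ns1.pool.lan"):
    """Delete an unreachable peer's A record so clients stop being sent to it."""
    script = (f"server {dns_server}\n"
              f"update delete {fqdn} A\n"
              "send\n")
    subprocess.run(["nsupdate", "-k", "/etc/bind/pool.key"],
                   input=script.encode(), check=True)

withdraw_peer("node3.pool.lan")
```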

How MultiAll solves the problem:

Basically, MultiAll will work as a service provider inside each server, acting as a global abstraction layer; once an application is given a bridge or connection layer to it, it can take advantage of the whole system. MultiAll's key features are:

  1. Topology-Aware Neighboring System (TNS): provides a topology discovery service for the servers (or “nodes”), which may “recalculate” the topology in cases of unexpected downtime (such as network, power or hardware failures, among others) or planned downtime (such as server maintenance, hardware upgrades or migrations). The different synchronization agents of MultiAll depend on information learned through TNS, so it is considered a critical component of the MultiAll system.
  2. System Baseline Monitor (SBM, formerly known as “checker”): provides 24/7 server health monitoring, locally on each server (or “node”), using “Baseline Rules” (a toy example follows this list). These rules are defined on a “per pool” or “per server” basis, allowing the configuration to be as granular as needed, with exceptions if the need arises. The health status is published via TNS using standardized codes known as “SBM Statuses”. It is also considered a critical component of the MultiAll system.
  3. File System Synchronization Agent (FS-SA): lets you define structures inside your filesystem to keep synchronized across a pool of servers. FS-SA, used in combination with the rest of MultiAll's Synchronization Agents (SAs), provides you with high availability, data redundancy and server load balancing on your pool of servers.
  4. Software and Libraries Synchronization Agent (SL-SA): especially useful for large-scale unattended deployments, the SL-SA keeps all your software packages, services and libraries consistent across the pool, raising awareness to the system administrators when possible conflicts, incompatibilities or other issues arise.
  5. Database Synchronization Agent (DB-SA): keeps the different database servers of the pool synchronized. Depending on the needs of the underlying applications, the DB-SA may either work 24/7 to keep the DBs in perfect sync, or you could define your own database synchronization policies.
  6. Domain Name System Synchronization Agent (DNS-SA): keeps the DNS zones up to date with the pool’s topology either via a pull-push synchronization mechanism (handled by the FS-SA) or by rebuilding the DNS zone according to the topology discovered by the TNS.
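To give a flavor of what a Baseline Rule could look like, here is a toy SBM check; the rule, threshold and status codes are invented for illustration only:

```python
import shutil

BASELINE_RULES = {
    "disk_free_min_bytes": 5 * 1024**3,   # a per-pool or per-server rule
}

def sbm_status(path="/"):
    """Evaluate the baseline and return an SBM status code."""
    free = shutil.disk_usage(path).free
    return "OK" if free >= BASELINE_RULES["disk_free_min_bytes"] else "DEGRADED"

print(sbm_status())   # a real SBM would publish this via TNS
```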

The project has gotten a bit more ambitious… so here is how it's changed!

Okay, so some research has been going on, and the project has grown quite a bit. Among other changes, it has gotten bigger: everything stated before is now part of a much larger system.
TakeConnector will now be the daemon that keeps our infrastructure running.

TakeLAN Connector – Administration to the next level
https://www.derekdemuro.com/2013/04/14/takelan-connector-administration-to-the-next-level/
Sun, 14 Apr 2013

The idea behind MultiAll.

Abstract:

Managing multiple servers that do similar work and host in-house applications: avoiding configuration time sinks and automating server configuration, allowing fast deployments with no scaling limit.

The problem:

The problem with MultiAll is keeping all the servers in sync when they sit in different “Farms or Grids”, plus the fact that one server can provide services for more than one purpose. This is why we invented the “Pool system”, where all the servers respond or act for a defined pool.

If a server responds to multiple pools, the server is the aggregation of every pool it serves.

Understanding how we can make it work:

Our research shows it's possible to merge more than one server to work toward a common goal by modifying the typical database-driven app to keep a cache table. When a server is unable to communicate with the others in its pool, it saves the changes it has to make on its sibling servers and, later on, when they are back online, deploys those changes.

On the file system side, every server is responsible for keeping the others in its pool up to date, partitioning the problem in two: every server is responsible both for the others and for itself. This works much like rsync: we only push the differences between servers, over a secure connection (e.g. a VPN).

At this point we would have an abstraction layer that lets multiple servers respond as one database and one file system.

Next is the Domain Name System problem. We don't want clients reaching servers that are offline due to maintenance or issues; for this, we're developing an abstraction layer on top of BIND 9 in which every server in a pool must be “Network-Aware” of its peers. If a server can't reach the others, it must update the DNS records to reflect the change.

How MultiAll solves the problem:

MultiAll will work as a service provider inside each server, as a global abstraction layer. Once an application is given a bridge or connection layer to it, it can take advantage of the system. MultiAll's key features are:

  1. Topology-Aware Neighboring System (TNS): provides a topology discovery service for the servers (or “nodes”), which may eventually “recalculate” the topology in case of unexpected downtime (such as network, power or hardware failures, among others) or instances of planned downtime (such as server maintenance, hardware upgrades or migrations). The different synchronization agents of the “Multi All” depend on information learned through TNS; therefore, it is considered a critical component of the “Multi All” system.
  2. System Baseline Monitor (SBM, formerly known as “checker”): provides a 24/7 server health monitoring, locally on each server (or “node”) using “Baseline Rules.” These rules are defined on a “per pool” or “per server” basis, allowing the configuration to be as granular as needed, making exceptions if the need arises. The health status is published via TNS using standardized codes known as “SBM Statuses.” It is also considered a critical component of the “Multi All” system.
  3. File System Synchronization Agent (FS-SA): lets you define structures inside your filesystem to keep synchronized across a pool of servers. FS-SA, used in combination with the rest of MultiAll's Synchronization Agents (SAs), provides you with high availability, data redundancy, and server load balancing on your pool of servers.
  4. Software and Libraries Synchronization Agent (SL-SA): especially useful for large-scale unattended deployments, the SL-SA keeps all your software packages, services, and libraries consistent across the pool, raising awareness to the system administrators when possible conflicts, incompatibilities or other issues arise.
  5. Database Synchronization Agent (DB-SA): keeps the different database servers of the pool synchronized. Depending on the needs of the underlying applications, the DB-SA may keep the DBs in perfect sync, or you could define your database synchronization policies.
  6. Domain Name System Synchronization Agent (DNS-SA): keeps the DNS zones up to date with the pool’s topology either via a pull-push synchronization mechanism (handled by the FS-SA) or by rebuilding the DNS zone according to the topology discovered by the TNS.

The project has gotten a bit more ambitious… so here is how it's changed!

Okay, so some research has been going on, and the project has grown quite a bit. Among other changes, it has gotten bigger: everything stated before is now part of a much larger system.
TakeConnector will now be the daemon that keeps our infrastructure running.
