WetGeek I'm starting to get uncomfortable about continuing down this path in this thread, but I don't really want to create a new thread so I can stay on-topic.
I don't know that there is a whole lot more to say about heavyweight versus lightweight because a lot of the criteria are subjective.
My personal definition of lightweight is that the distro (and all component apps) will run comfortably (that is, no screen lags, able to open/close apps quickly, able to run three apps at the same time, and able to accommodate 8 +/- open tabs in a browser) on a ten-year-old Celeron or Pentium CPU (or AMD equivalent like A-6 or A-9), 2-4 GB of RAM, and 10-20 GB +/- of SSD/HDD storage space. Although many of the lightweight distros will run on less than that, to me the specs I've given are more-or-less the bottom of what I think that a Linux distro should be expected to accommodate.
Before it blew up on me, my Inspiron 11-3180 (A-6, 4 GB RAM, 32 GB eMMC SSD) was my "canary in the coal mine" test box. Windows 10 ran on it marginally, Solus Budgie ran okay so long as I didn't push it, Ubuntu (Gnome) was sluggish (almost as bad as Windows 10), and Lubuntu (Ubuntu LXDE) and Zorin 16 Lite (XFCE) both were reasonably snappy.
I'm not sure how, exactly, to objectively differentiate between lightweight and heavyweight, in terms of measurement.
The "Load average" reading in htop measures one thing and one thing only -- demand on the CPU (roughly, how many processes are running or waiting to run) averaged over stated intervals -- and the readings vary wildly over time. For example, when I was loading my 123 tabs, I saw the 1-minute average jump to 5.71 at one point, but as soon as the tabs were all loaded, the 1-minute average dropped to 0.40. The 5-minute average at that point was 1.15 and the 15-minute average was 1.09. See the whole-screen screenshot in my comment above if you want to look at the htop reading with 123 tabs open. I suspect that one of the reasons that Gnome showed well in your analysis is that the snapshot was taken after things settled down, and the Budgie/Plasma readings were taken before things settled down completely. You made an attempt to level off the processes in your second set of readings, and that improved things, but still, there is no way that Gnome is less resource-intensive than Budgie/Plasma.
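If anyone wants to watch the averages settle rather than trust a single snapshot, something like the following rough sketch might help. It assumes a Linux box where /proc/loadavg is available; the function names (parse_loadavg, per_core, read_loadavg) are just ones I made up for illustration, and dividing by the core count is my own suggestion for comparing different machines, not anything htop does.

```python
import os

def parse_loadavg(text):
    """Return the 1-, 5-, and 15-minute load averages from /proc/loadavg-style text."""
    fields = text.split()
    return tuple(float(value) for value in fields[:3])

def per_core(loads, cores):
    """Divide each average by the core count so a 2-core and an 8-core box compare fairly."""
    return tuple(value / cores for value in loads)

def read_loadavg(path="/proc/loadavg"):
    """Read the current load averages from the kernel (Linux only)."""
    with open(path) as handle:
        return parse_loadavg(handle.read())

def read_loadavg_per_core():
    """Convenience wrapper: current load averages divided by the number of CPUs."""
    cores = os.cpu_count() or 1  # cpu_count() can return None
    return per_core(read_loadavg(), cores)
```

Calling read_loadavg() every few seconds while the tabs load, and again a few minutes after they finish, would show the same spike-then-settle pattern I described, and would make it easier to take readings at comparable moments on each DE.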
The "Memory" reporting tool is more or less static depending on what is loaded into memory. In my 123 tab experiment, the reported memory use climbed with each batch-install, and with all the tabs open came close to 10 GB RAM in use. Obviously, that isn't going to sit well on a computer with 2-4 GB of RAM, so performance is going to be negatively affected (both in increased CPU usage and time-delay). So the packaged apps and how the apps are used can make a big difference, and that is inherently subjective.
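For the memory side, here is a similar sketch, again assuming Linux and /proc/meminfo. The helper names (parse_meminfo, used_fraction, read_meminfo) are hypothetical, and treating "MemTotal minus MemAvailable" as memory in use is one reasonable convention rather than the only one; different tools may count cache and buffers differently.

```python
def parse_meminfo(text):
    """Turn /proc/meminfo-style text into a dict of {field name: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])  # first token is the number
    return info

def used_fraction(info):
    """Fraction of RAM in use, counting anything not 'available' as used."""
    return 1 - info["MemAvailable"] / info["MemTotal"]

def read_meminfo(path="/proc/meminfo"):
    """Read current memory figures from the kernel (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

Taking a reading at an idle desktop right after login, and again with the same set of apps and tabs open, would at least give each DE the same workload to be measured against.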
Tracking the processes that are running is a reasonable thing to look at, too, and distros handle processes differently to some extent. The kernel is the kernel, and will do what it will, but apps can increase the number of processes running a great deal if not carefully chosen and packaged.
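A crude way to put a number on the process count, sketched under the assumption of a Linux /proc filesystem where each running process shows up as a directory named after its PID. The function names are made up for illustration, and this counts processes only, not the threads inside them.

```python
import os

def count_pids(names):
    """Count entries that look like PIDs (all digits) in a list of /proc entry names."""
    return sum(1 for name in names if name.isdigit())

def count_processes(proc_dir="/proc"):
    """Count the processes currently visible under /proc (Linux only)."""
    return count_pids(os.listdir(proc_dir))
```

Comparing count_processes() on a fresh login across distros would show how much each one starts by default, before any apps are opened.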
So I don't know what to say at this point. I rather liked this explanation of lightweight versus heavyweight:
Lightweight distros are distributions specially designed for old and resource constraint hardware, so that the user can have a responsive and lag-free computing experience even on old hardware that has low specs in terms of processing power, disk space, and RAM. As for Heavyweight Distros, it is a concept opposite to Lightweight Distros. These distros usually have the latest and greatest features, and give users the best computing experience.
In general, weight follows the DE. The DEs used in lightweight distros have few, if any, bells and whistles, while the DEs used in heavyweight distros have lots. But at bottom, I can't figure out any single test that would reliably and consistently differentiate between the two, and the ultimate criterion ("How well does this run on my computer?") is subjective.
Sometimes lightweight doesn't win over heavyweight. It depends on the use case. When I set up the three shop computers at the railroad, I was working with marginal equipment -- 2014-era i3 CPU, onboard Intel graphics, 4 GB RAM and 80-ish GB HDDs. I replaced the HDDs with 128 GB SSDs and tested both Zorin 16 Core and Zorin 16 Lite on the computers. Both used LibreOffice Base for the database, and that is all that they used. Core was noticeably slower than Lite at first, but, in the end, once the database was opened in the morning, the two performed about the same. In that case, the use case could accommodate a heavyweight DE, and I could set Core up so that it looked/felt like Windows 7/10, which is what the users were used to at home.
So that's my two cents. I've enjoyed following this thread, but I'm not sure that any of it makes a lot of difference, particularly when the results seem to show both what is true (MATE is lighter in the loafers than Budgie/Gnome/Plasma) and what isn't true (Gnome is lighter in the loafers than either Plasma or Budgie).