Sample Investigation

THE SCALABILITY OF THE AMD X86_64 ARCHITECTURE IN A HIGH PERFORMANCE CLUSTERING ENVIRONMENT USING COMMONLY AVAILABLE CONSUMER COMPUTER COMPONENTS

Aaron Caveglia and Jason Petsod conducted an investigation to explore computer architecture scalability through the use of commercial computer components. They assembled four nodes, a master node and three computational nodes, from off-the-shelf components and connected them with a 100Mbps switch. Their advisor was IMSA staff member Mr. Jim Gerry.

Each node consisted of an ASUS K8V Deluxe motherboard outfitted with an AMD Athlon™ 64 3000+ processor, 256MB of PC3200 RAM, and an ATI Radeon® 7000 graphics processor. The master node included an additional 256MB of RAM and a Western Digital 36GB Raptor® hard drive for long-term storage of the operating system.


Abstract

AMD’s x86_64 architecture was a breakthrough in personal computing because it natively supported the x86 architecture while enabling 64-bit computing. In our Inquiry, we used commonly available off-the-shelf computer components to explore the outward scalability of the architecture in various high performance clustering implementations under Gentoo Linux, a distribution of Linux we chose because it lends itself to customization and optimization.

To test scalability, we ran a battery of tests both with a kernel-based process migration clustering implementation and with applications that had integrated clustering implementations. For the latter, we reran the tests under the kernel-based clustering implementation with each application’s clustering features disabled, so that the two clustering implementations could be compared.

In tasks that were sensitive to latency, such as network-distributed code compilation, we saw diminishing returns: the change from one node to two nodes was greater than the change from two nodes to three, so the overall rate of change was not constant. However, tasks such as authentication strength analysis and multimedia encoding and rendering, which have nearly unlimited parallelism, scaled linearly: the changes from one node to two nodes, from two nodes to three nodes, and from three nodes to four nodes were nearly identical.
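
The difference between the two scaling patterns can be made concrete with a small calculation. The sketch below is only an illustration: the timing values are hypothetical placeholders, not measurements from this investigation, and the function names are our own. It computes the speedup relative to a single node and the gain contributed by each additional node.

```python
def speedups(times):
    """Speedup at each node count, relative to the single-node time."""
    base = times[0]
    return [base / t for t in times]


def marginal_gains(values):
    """Difference between each consecutive pair of values."""
    return [b - a for a, b in zip(values, values[1:])]


# Hypothetical wall-clock times in seconds for 1, 2, 3 and 4 nodes.
# These are illustrative assumptions, NOT data from the investigation.
latency_bound = [120.0, 75.0, 60.0, 54.0]   # e.g. a latency-sensitive task
highly_parallel = [120.0, 60.0, 40.0, 30.0]  # e.g. a highly parallel task

for name, times in [("latency-bound", latency_bound),
                    ("highly parallel", highly_parallel)]:
    s = speedups(times)
    g = marginal_gains(s)
    print(name, "speedups:", [round(x, 2) for x in s])
    print(name, "gain per added node:", [round(x, 2) for x in g])
```

With these placeholder numbers, the highly parallel series gains the same amount of speedup for every node added, while the latency-bound series gains less with each additional node, which is the diminishing-returns pattern described above.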