While thousands of avid gamers build their own PCs, few data science enthusiasts have recognized the benefits of a custom-built data science PC. However, as you progress in data science, you will eventually need to upgrade your hardware. Otherwise, how can you run hundreds of millions of data points through a deep learning model on your Chromebook?
This doesn’t mean that a Chromebook can’t handle such large data sets at all. It can, but the job may take days, weeks, or even months to complete. Even though you can rely on your laptop or your employer’s cloud computing services, such as AWS EC2, building your own PC is a prudent long-term investment for personal tech projects.
That said, read on to understand how you can build your data science PC on a budget.
You will need the following parts:
1. GPU
The GPU is arguably the most important component for any data scientist, especially those working on deep learning projects. Unfortunately, choosing the right graphics card is also one of the hardest decisions to make. If you want a PC for small projects, the GeForce RTX 2060 is a great option.
The GeForce RTX 2060 is built on NVIDIA’s Turing architecture and comes with 240 Tensor Cores, which speed up deep learning training. Its main drawback is memory size: its 6GB of VRAM is fine for small projects but limiting for large, complicated ones.
A cheaper yet capable alternative to the RTX 2060 is the NVIDIA GTX 1660S. This model also has 6GB of VRAM, along with a 125W TDP and a boost clock of about 1.83GHz, although it lacks the RTX 2060’s Tensor Cores. Both models offer good value for the money.
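To get a feel for when 6GB of VRAM becomes limiting, here is a rough back-of-the-envelope sketch. It assumes 32-bit values and an Adam-style optimizer that keeps four copies of every parameter (weights, gradients, and two moment buffers); the function name and the example model sizes are hypothetical, and activations and framework overhead are left out, so real usage will be higher.

```python
def training_memory_gb(num_params, bytes_per_value=4, copies=4):
    """Rough lower bound on the VRAM needed to train a model.

    Assumptions (not from the article): 4-byte floats and 4 copies of each
    parameter (weights + gradients + two optimizer moment buffers).
    Activations and framework overhead are NOT included.
    """
    return num_params * bytes_per_value * copies / 1024**3


VRAM_GB = 6  # RTX 2060 and GTX 1660S, as quoted in the text

# Hypothetical model sizes, chosen only for illustration.
examples = [
    ("25M-parameter model", 25_000_000),
    ("350M-parameter model", 350_000_000),
    ("1.5B-parameter model", 1_500_000_000),
]

for name, params in examples:
    need = training_memory_gb(params)
    verdict = "fits" if need < VRAM_GB else "does not fit"
    print(f"{name}: ~{need:.2f} GB before activations -> {verdict} in {VRAM_GB} GB")
```

A model that only just fits by this estimate will probably still run out of memory in practice, because activations grow with batch size.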
2. CPU
As you choose a CPU for your data science PC, keep in mind that a capable processor still matters for machine learning, even when the GPU does the heavy lifting in deep learning. Many ML workloads, such as neural network training, can be split into tasks that run in parallel rather than one after another, so they benefit from multiple cores and threads. Even on a budget, you shouldn’t settle for an underpowered processor.
Most people have trouble choosing between AMD and Intel processors. For a good price and solid performance, AMD is a great choice. This is not to say that Intel processors are below average, but they tend to be more expensive and are overkill for a budget build. You will get similar processing power at a lower price with AMD processors.
With that in mind, a strong budget option is the AMD Ryzen 5 3600. It features 6 cores, 12 threads, and a 65W TDP, which makes it a good fit for small and medium-sized projects.
3. RAM
When using a GPU for deep learning, data is loaded from the storage disk into RAM and then transferred to the graphics card’s memory (VRAM). Buying more RAM therefore won’t increase the GPU’s training speed or let you train on larger batches, since both are limited by VRAM. You only need enough RAM, somewhat more than the VRAM capacity, for the transfer to run smoothly.
A common rule of thumb is to buy about twice as much RAM as you have VRAM. If you chose the RTX 2060 or GTX 1660S with 6GB of VRAM, that works out to 12GB, so a 16GB kit is the practical choice. In this case, the Corsair Vengeance LPX DDR4 is a good option. For genetic data sets or other projects with large data sets that require heavy pre-processing, choose 32GB of RAM instead.
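The sizing rule above is simple arithmetic, and a small sketch makes it easy to reuse for other cards. The list of kit sizes and the function name are assumptions made for illustration; adjust them to whatever capacities are actually on sale.

```python
# Assumed typical desktop RAM kit capacities in GB (illustrative, not exhaustive).
COMMON_KIT_SIZES_GB = [8, 16, 32, 64, 128]


def recommended_ram_gb(vram_gb, multiplier=2):
    """Return the smallest common kit size that is at least multiplier x VRAM."""
    target = vram_gb * multiplier
    for size in COMMON_KIT_SIZES_GB:
        if size >= target:
            return size
    # Fall back to the largest size in the list if the target exceeds all of them.
    return COMMON_KIT_SIZES_GB[-1]


# 6 GB of VRAM gives a 12 GB target, which rounds up to a 16 GB kit.
print(recommended_ram_gb(6))
```

Raising the `multiplier` is one way to model pre-processing-heavy projects, where the 32GB recommendation applies.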
4. Motherboard
In most cases, your CPU choice determines your choice of motherboard, because the board’s socket and chipset must be compatible with the processor. The Ryzen 5 3600 uses AMD’s AM4 socket, so you need a motherboard with a compatible AMD chipset. Gigabyte, ASUS, and MSI are established brands with reliable AMD motherboards. To keep costs down, consider the MSI B450M Pro-VDH Max.
5. Storage Disks
Data science projects involve large data sets and need plenty of disk space. However, you also need fast disks that won’t bottleneck data transfer and access. Currently, there are three main types of storage disks:
- HDD – hard disk drives are the most affordable option but have slow read and write speeds. Because they rely on spinning mechanical platters, typical R/W speeds range between 80 and 200MB/sec.
- SSD – these drives provide much faster data transfer than HDDs. Models that connect over SATA top out at roughly 550MB/sec because of the interface limit. However, they are more expensive than HDDs.
- NVMe – Non-Volatile Memory Express drives are SSDs that connect over PCIe, making them the fastest and most expensive option. They provide sequential R/W speeds of up to 3500/3300MB/sec.
To balance performance with budget, you can combine two types of drives. For instance, buy an HDD for storing inactive data and a small SSD or NVMe drive for active data sets. The Seagate Barracuda 2TB HDD and the Samsung 970 Evo Plus NVMe make a good combination.
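To see why active data sets belong on the faster drive, the sketch below estimates how long one full sequential read of a data set would take on each of the two recommended drive types. The 100GB data set is a hypothetical example, and the throughput figures are the best-case numbers quoted above, so real-world times will be longer.

```python
def read_time_seconds(dataset_gb, throughput_mb_s):
    """Time to read a data set once at a sustained sequential throughput."""
    return dataset_gb * 1000 / throughput_mb_s


# Best-case sequential read speeds quoted in the text (MB/sec).
DRIVES_MB_S = {
    "HDD": 200,
    "NVMe": 3500,
}

dataset_gb = 100  # hypothetical data set size

for name, speed in DRIVES_MB_S.items():
    seconds = read_time_seconds(dataset_gb, speed)
    print(f"{name}: about {seconds / 60:.1f} minutes to read {dataset_gb} GB")
```

The gap matters most when a training job rereads the data set every epoch, which is why the HDD is better kept for archives.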
Other essential parts include:
- Fans – to keep temperatures down, buy the upHere RGB Case Fan
- Power Supply Unit – the Corsair CV Series provides enough power for this build
- Case – the NZXT H510 accommodates all components while allowing proper ventilation
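If you want to sanity-check how much power is "enough", a rough estimate is to add up the component TDPs and leave some headroom. In the sketch below, the GPU and CPU figures are the TDPs quoted earlier in this guide; the 75W allowance for everything else, the 1.5x headroom factor, and the rounding to 50W steps are assumptions chosen for illustration, not manufacturer guidance.

```python
import math


def recommended_psu_watts(component_watts, headroom=1.5, step=50):
    """Sum component power draw, apply headroom, and round up to the next step."""
    total = sum(component_watts.values())
    return math.ceil(total * headroom / step) * step


build = {
    "GPU (GTX 1660S TDP)": 125,       # from the text
    "CPU (Ryzen 5 3600 TDP)": 65,     # from the text
    "everything else (assumed)": 75,  # motherboard, RAM, drives, fans: a guess
}

print(f"Suggested PSU rating: {recommended_psu_watts(build)} W")
```

Swap in the figures for your own parts, and check the graphics card manufacturer's recommended PSU rating before buying.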
The Bottom Line
Hopefully, the guide above provides enough inspiration for anyone who wants to build their own data science PC. As a pro tip, make sure you install the latest drivers for every hardware component, especially the GPU, so that you get the most out of each part. Additionally, use a good analytics platform to ease your deep learning projects.
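Once the GPU driver is installed, a quick way to confirm it is working is to ask NVIDIA's `nvidia-smi` tool, which ships with the driver, what it can see. The sketch below is one possible check, assuming an NVIDIA card; the helper names are hypothetical, and the script simply reports nothing if `nvidia-smi` is missing.

```python
import shutil
import subprocess

# Ask nvidia-smi for the GPU name, total memory, and driver version as CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total,driver_version",
    "--format=csv,noheader",
]


def parse_gpu_line(line):
    """Split one CSV line from nvidia-smi into a small dictionary."""
    name, memory, driver = [field.strip() for field in line.rsplit(",", 2)]
    return {"name": name, "memory": memory, "driver": driver}


def detect_gpus():
    """Return a list of detected GPUs, or an empty list if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed or not on PATH
    result = subprocess.run(QUERY, capture_output=True, text=True)
    if result.returncode != 0:
        return []  # tool is present but could not talk to the driver
    return [parse_gpu_line(l) for l in result.stdout.splitlines() if l.strip()]


gpus = detect_gpus()
if gpus:
    for gpu in gpus:
        print(f"{gpu['name']}: {gpu['memory']}, driver {gpu['driver']}")
else:
    print("No NVIDIA GPU detected; check that the driver is installed.")
```

If the card shows up here with the expected amount of memory, your deep learning framework should be able to find it too, provided it was installed with GPU support.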