A slightly better way to overclock and tweak your Nvidia GPU
Hello all, in this doc I'll try to show a better way to overclock modern Nvidia graphics cards based on the Pascal and Turing architectures.
Most guides you can find online about overclocking Nvidia GPUs only talk about opening an overclocking tool and putting random offsets on the GPU core and GPU memory for as long as the card doesn't crash. Sadly, that's no longer a great way to overclock Pascal and Turing cards.
Since Nvidia introduced the GPUBOOST technology on the GTX 600 series, people have been facing some annoying issues: a power limit that is too low and enforced by the card's vBIOS, weird adaptive clocking behaviour that keeps cards from running at full speed during some specific loads, and the card automatically clocking down when it reaches specified temperature "steps".
GPUBOOST is the reason why plain offset overclocking isn't really a great way to OC anymore: just adding offsets will still give you higher, yes, but really unstable core clocks and poorer performance.

-DISCLAIMER
Overclocking is, by definition, an attempt to make hardware run faster at out-of-spec frequencies and ranges. Even though for most users and most cards the chances of damaging something are pretty low, there's always some risk involved. Also keep in mind that, due to the million different combinations of cards, BIOSes and systems in general, you may experience weird or different behaviour when doing these things. I also want to point out that this is just informative content, and I do not take any responsibility if you damage your hardware or something goes wrong.

-SOME OF THE THEORY BEHIND
--THE VOLTAGE/FREQUENCY CURVE (V/F CURVE)
Before starting, we need to talk a bit about how these graphics cards handle adaptive clocking. To handle different GPU power states and loads, GPUBOOST raises and lowers the core frequency based on a voltage/frequency curve at the NVAPI level.
Starting with Pascal, Nvidia implemented a new way to handle GPU clocks: they are now managed through Voltage/Frequency curves. Just like Maxwell and Kepler cards, Pascal and Turing cards have a default V/F curve, which is completely BIOS dependent. The main difference is that now we can live-modify the curve parameters from the OS using overclocking software like MSI Afterburner.
"Old style" core clock offset calls are still present at the NVAPI level, but they just shift the whole curve, as is, a bit higher on the graph. The result is higher, yes, but still unstable boost clocks, due to various limitations I will talk about later in the doc.
A V/F curve is nothing more than a series of voltage and frequency points, located every 15 MHz and 25 mV.
You can open the curve by pressing CTRL+F on your keyboard while on the main page of MSI Afterburner, or by clicking the little "bars" symbol on the left side of the core clock slider.
There's obviously a limit to how far, frequency and voltage-wise, we can actually make the GPU run on that curve. Parameters like the curve itself and the maximum allowed voltage, power and temperature limits are completely BIOS dependent.
The "voltage slider" in tools like Afterburner "unlocks" the upper part of the curve, potentially allowing the card to access and use higher voltage points. As a general baseline, without considering the power limit (read below), most cards on most standard, non-special BIOSes are allowed to go up to 1.093V when the voltage slider is maxed out.
Extreme OC BIOSes (XOC BIOSes) are usually not publicly available, but they exist and, among other things, they usually raise the maximum allowed voltage to around 1.1-1.2V.
Fantastic, so should I just use the 1.093V voltage point on the curve and drag it up to the frequency I want to run?
Sadly, that's not quite the case, and here's the second problem our cards will run into: the power limit.

--THROTTLING CAUSES
--POWER LIMIT
The power limit is part of GPUBOOST itself and is mainly meant to balance performance and power savings, but it's configured so tightly that the cards are almost power starved. Such a tight power limit is nice for power efficiency, but it's also really disruptive for a card's overclocking potential, as it greatly limits the frequencies the GPU can reach.
Every card has power monitoring circuitry that, as you might guess, monitors how much power the graphics card is pulling from your PSU. When the maximum allowed power limit is reached, the card automatically steps down to a lower V/F curve point, lowering its frequency and voltage to keep the power draw within spec.
The maximum allowed power limit is another parameter that is completely dependent on, and enforced by, the vBIOS the card is currently using. Users are allowed to increase or decrease the power limit to some extent, but not nearly enough to let the card constantly run on high V/F curve points like 1.093V. This is the main reason why graphics cards seem to throttle randomly and have a pretty unstable core frequency.
Combining the previous information, we can say that most Pascal and Turing cards can generally run constantly and stably at voltages somewhere in the 1.00-1.043V area of the V/F curve (again, all of this is card and BIOS dependent).

--POWER LIMIT REMOVAL/MITIGATION
For Nvidia's GTX 900 series and earlier generations, BIOS modding tools were, and still are, publicly available on the net. These tools let users modify critical BIOS parameters like frequencies and the power, temperature and voltage limits.
Sadly, starting with the Pascal generation, Nvidia began encrypting their BIOSes, and no more BIOS modding tools were officially published online.
So what can we do about the power limit?
-The most effective way would be physically hardmodding the graphics card, but I won't cover that in this doc. It's definitely advanced stuff that, if done wrong, can turn your graphics card into a nice expensive paperweight.
-Software-wise, there's sadly not much that can be done to completely remove the power limit, but there are some mitigations that can definitely make a difference when overclocking.
One of them is flashing a different BIOS with a higher maximum allowed power limit onto your graphics card. There are some special BIOSes (XOC BIOSes) for top-end models, like the old GTX 1080 and 1080 Ti and the current top-end RTX 2080 Ti, that completely remove the power limit for extreme overclocking purposes, but most of these special BIOSes are private and not publicly available (there are some exceptions, of course).
Although flashing a different vBIOS is not a mandatory step, it can definitely affect how far you'll be able to push your card. I'll cover part of this procedure later in the doc.

--THE TEMPERATURE ASPECT
On Pascal and Turing cards, temperature is another key factor of GPUBOOST, and it heavily dictates the card's behaviour under load and its overclockability.
These cards automatically run at higher core frequencies at lower temperatures, so cooling your card effectively is actually the most important step to take when dealing with them.
On top of that, lowering the GPU temperature also lowers the GPU power draw, possibly resulting in a higher obtainable running voltage point on the V/F curve.
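As a rough illustration of how the power limit and the V/F curve interact, here is a minimal Python sketch. It assumes the common approximation that dynamic power scales with V² × f, which is only a simplification: GPUBOOST's real algorithm isn't public. Every number in it (curve points, reference power draw, power limits) is made up for illustration, and the function names are hypothetical.

```python
def estimated_power(voltage_mv, freq_mhz, ref_point, ref_power_w):
    """Scale a reference power draw using the P ~ V^2 * f approximation."""
    ref_mv, ref_mhz = ref_point
    return ref_power_w * (voltage_mv / ref_mv) ** 2 * (freq_mhz / ref_mhz)


def highest_sustainable_point(curve, power_limit_w, ref_point, ref_power_w):
    """Return the highest (mV, MHz) point whose estimated draw fits the limit.

    Returns None if even the lowest point is estimated to exceed the limit.
    """
    best = None
    for point in sorted(curve):
        if estimated_power(*point, ref_point, ref_power_w) <= power_limit_w:
            best = point
    return best


# Hypothetical curve points (mV, MHz), NOT taken from a real card.
curve = [(1000, 1950), (1025, 1980), (1050, 2010), (1075, 2040), (1093, 2070)]
reference = (1000, 1950)   # assumed point where the power draw was observed
reference_power = 280.0    # assumed draw in watts at that point

for limit_w in (300.0, 330.0, 380.0):
    point = highest_sustainable_point(curve, limit_w, reference, reference_power)
    print(f"{limit_w:.0f} W limit -> {point}")
```

With these made-up numbers, a 300 W limit only sustains the 1025 mV point, while 380 W is enough for the 1093 mV point. That's the same effect a higher power limit BIOS or a cooler card (lower power draw) is meant to have.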
[THERMAL DOWNCLOCK]
One of the big problems related to temperature, besides actual physical core clock stability, is the GPU automatically downclocking on its own when specific temperature thresholds are reached. These thresholds are spread all along the normal operating temperature range.
This behaviour is pretty much unavoidable with ambient temperature cooling and is tied to the card's normal operation (the first downclocking steps start at even 3°C). Luckily there's a workaround to stop thermal downclocking. I have to say it's a pretty buggy method that doesn't always work, depending on your card and BIOS, but by using a very old Nvidia laptop power saving feature implemented in their drivers we can try to somewhat mitigate adaptive clocking and thermal downclocking. I'll talk about this later in the doc.

--THE TL;DR
-Under load, GPUBOOST will automatically use higher points on the V/F curve as long as the temperature or power limits are not reached.
-To overclock Pascal and Turing cards it's better to use the NVAPI Voltage/Frequency curve, by editing a specific stable voltage point.
-The power limit is the biggest issue for Pascal and Turing overclocking, but it can be somewhat mitigated.
-Temperature is a key factor for frequencies and stability.
-A combination of power limit and reached temperature thresholds will make cards downclock and have an unstable core clock under load.

--THE PRACTICAL STUFF
First of all, let's prepare the basic tools needed for the work.
Overclock utilities
-The OC utility: MSI Afterburner, downloadable here https://www.guru3d.com/files-details/msi-afterburner-beta-download.html
-General purpose graphics card monitoring and information utility: GPU-Z, downloadable here https://www.techpowerup.com/gpuz/
-Stress test/benchmarking software: UNIGINE Superposition benchmark, downloadable here https://benchmark.unigine.com/superposition
BIOS flashing and backup tools
-Stock nvflash
-Modified nvflash with board ID mismatch disabled
[Advanced] Potential advanced clocking fix software (see the "Potential fix to thermal downclocking" section below)
-NVPMManager
-ThermSpyPremium

-POWER LIMIT MITIGATION WITH BIOS FLASHING
I won't give a full BIOS flashing guide here, since it would make everything too long and confusing, but I can give you some advice. As said before, this step is not mandatory, but it can drastically change the result of your overclock.
A little disclaimer about BIOS flashing: although the chances of permanently breaking your card by flashing another BIOS are quite low, keep in mind that there's always a risk with this kind of procedure. In the worst case, if things go wrong, there's a high chance you'll be able to recover the card by flashing the stock BIOS again while driving the display from your iGPU or a different dedicated GPU.
First things first, back up your original BIOS. There are different ways to do it, but GPU-Z is probably the easiest: just open it and, on the right side of the interface, right next to the UEFI text, there's the BIOS dump button.
After saving your original BIOS, you need to find a compatible BIOS for your card with a higher allowed power limit. First, check your current maximum allowed power limit: open the "Advanced" tab in GPU-Z, select "Nvidia BIOS" in the menu and check your Default and Maximum power limits under the Power Limit section.
Now that you know your card's maximum power limit, you need to find a BIOS that allows a higher one. To find a potentially better BIOS, you may want to go to TechPowerUp's massive VGA BIOS archive https://www.techpowerup.com/vgabios/
Just select NVIDIA as GPU brand and select your card model, and a list of all the uploaded BIOSes will pop up.
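Once you've written down the Default and Maximum power limits of your current BIOS and of a few candidates, picking one is a simple comparison. Here's a minimal Python sketch of that bookkeeping. The BIOS names and wattages are hypothetical placeholders, not real entries from the archive, and the values still have to be collected by hand.

```python
from dataclasses import dataclass


@dataclass
class BiosEntry:
    name: str          # label you give the BIOS (hypothetical names below)
    default_w: float   # "Default" power limit in watts
    max_w: float       # "Maximum" power limit in watts

    @property
    def max_percent(self) -> float:
        """Maximum limit expressed as a percentage of the default limit."""
        return 100.0 * self.max_w / self.default_w


def better_bioses(current, candidates):
    """Candidates with a higher maximum power limit than the current BIOS, best first."""
    better = [b for b in candidates if b.max_w > current.max_w]
    return sorted(better, key=lambda b: b.max_w, reverse=True)


# Made-up values for illustration only.
current = BiosEntry("my stock BIOS", default_w=250.0, max_w=300.0)
candidates = [
    BiosEntry("candidate A", default_w=260.0, max_w=338.0),
    BiosEntry("candidate B", default_w=250.0, max_w=280.0),
    BiosEntry("candidate C", default_w=300.0, max_w=373.0),
]

for bios in better_bioses(current, candidates):
    extra_w = bios.max_w - current.max_w
    print(f"{bios.name}: max {bios.max_w:.0f} W (+{extra_w:.0f} W over current)")
```

Only the maximum limit is compared here. Whether a BIOS is actually compatible with your card is a separate question this sketch doesn't answer.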
Sadly there's no fast way to compare the power limits of various BIOSes all at once, so you'll have to open the different BIOSes one by one and compare their maximum allowed power limits.
Here's an example of what a BIOS page on TechPowerUp looks like: [Screenshot: example BIOS page on TechPowerUp]
Once you have found a BIOS with a higher power limit, just press the download button and save the .rom file.
To flash the BIOS, here's a nice guide on how to do it (Turing cards) https://www.overclockersclub.com/guides/how_to_flash_rtx_bios/
Here's a guide for Pascal cards; the procedure is the same anyway https://www.overclock.net/forum/69-nvidia/1627212-how-flash-different-bios-your-1080-ti.html

--THE ACTUAL OVERCLOCKING
OK, now here's the actual overclocking part.
-GPU CORE CLOCK OC
Let's start by opening MSI Afterburner and setting it up correctly.
Open the settings menu by clicking the gear on the UI, open the "General" tab and set it as shown in the picture. [Screenshot: Afterburner "General" settings]
Now open the "Monitoring" tab and enable the GPU Voltage graph.
Hit Apply and press OK. Afterburner should ask to be restarted; do it and let it reopen.
Now we have to prepare the graphics card for its first test run:
-Make sure your GPU is at stock: reset it by pressing the reset arrow in the middle of Afterburner or by pressing CTRL+D.
-Max out the Voltage, Temperature limit and Power limit sliders.
-I highly suggest setting a fixed fan speed at a noise level you're still comfortable with because, as said earlier, temperature is a key factor, so running the card cooler will automatically help you reach higher frequencies.
Your Afterburner UI should look similar to this: [Screenshot: Afterburner main window with the sliders maxed out]
Now hit the Apply check mark in Afterburner and open UNIGINE Superposition Benchmark. Do not close Afterburner, since we need it running in the background to monitor the GPU for later.
Navigate to the "Game" tab of the software and select either the 1080p extreme or the 8K optimized preset; those are the heaviest presets in Superposition. 8K optimized is suggested to load the GPU memory modules more heavily, especially on cards with a lot of VRAM. Using other presets will invalidate the whole procedure, because they are lighter on the GPU and make it draw slightly less power.
Now click Run and wait for the test to load up.
Once the test has loaded, press the "Cinematic mode" button in the top left corner. Superposition will then loop through its preset scenes until manually stopped.
Leave it running for at least 10 minutes, or longer, until you are sure the GPU has reached thermal and voltage stability.
Once you're done, close the stress test and quickly open Afterburner, then click on the "Detach" button in the lower part of the UI to see the full length graphs.
Look for the GPU Voltage graph. You'll see it's pretty unstable, with a lot of dips and peaks. You have to find the lowest dip that occurred while the card was still under load. As you can see, in MY particular case the lowest voltage under load for my card, with this particular BIOS and cooling, was 1.037V. [Screenshot: GPU Voltage graph in Afterburner]
Again, keep in mind that your voltage can differ from mine.
Links for the pages and tools mentioned so far that were not given inline:
-GPUBOOST technology page: https://www.geforce.com/hardware/technology/gpu-boost/technology
-Stock nvflash: http://www.mediafire.com/file/126i7jujmcl95qi/nvflash_5.590.0.zip/file
-Modified nvflash: http://www.mediafire.com/file/j0c18wugictcyxw/nvflash64_patched_5.590.0.zip/file
-NVPMManager: http://www.mediafire.com/file/osyoet15m2xo2lm/NVPMManagerUni.exe/file
-ThermSpyPremium: http://www.mediafire.com/file/n9dzjr34t3jmvv4/ThermSpyPremium.exe/file
Now that you know your card's stable voltage under load, we can actually get into raising the core clock frequency using the V/F curve.
Open the V/F curve in Afterburner by pressing CTRL+F and find your exact stable voltage point; again, in MY case it's the 1037mV point on the curve.
Now select that particular point with your mouse and "drag" it upwards until it matches the frequency you want to try on your GPU.
Remember that Nvidia uses a 15MHz clockgen for these cards, so you should only raise the point in 15MHz steps. For example, let's say I want to try running 2070 MHz on my GPU: I have to drag the 1037mV point up to the 2070MHz mark (frequencies in MHz are on the left vertical axis).
After raising the point to the desired frequency, keep the V/F graph open and hit Apply in Afterburner.
If everything was done correctly, the curve should now look something like this: [Screenshot: edited V/F curve]
As you can see, the curve from the stable voltage point (in this case 1037 mV) and above is slightly higher.
Make sure there are no other V/F points on the same horizontal line as your stable voltage point: the stable voltage point must be the last one on its horizontal line before a slightly lower point.
The last thing to do is lock that V/F point, forcing the card to always run in P0 state at full clock speeds. To do that, click on the point you just set up and lock it by pressing L on your keyboard; the point will turn yellow.
You can now see that the GPU and MEM clocks and the GPU voltage are locked to full speed. This frequency lock will make your card run slightly hotter at idle. Even though it's not strictly needed, it's highly suggested to prevent clock fluctuations under load.
Now you just need to run UNIGINE Superposition in "Game" mode again and let it run for at least another 10 minutes to make sure there's no more voltage fluctuation.
As you can see in the pic, after running Superposition again the difference is quite dramatic: the voltage line is literally flat, with no fluctuations at all, and so is the GPU core clock. [Screenshot: flat GPU Voltage and core clock graphs]
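The two small calculations in the procedure above (finding the lowest voltage dip under load, and keeping the target frequency on the 15MHz grid) can be sketched in Python. This is only an illustration under assumptions: the sample data is made up, the 95% usage threshold for "under load" is my own choice, and how you export the monitoring samples from your tool is up to you.

```python
def lowest_voltage_under_load(samples, min_load_percent=95):
    """Lowest logged voltage (mV) among samples taken while the GPU was loaded.

    `samples` is a list of (gpu_usage_percent, voltage_mv) pairs. The 95%
    threshold for "under load" is an assumption; adjust it to your own log.
    """
    loaded = [mv for usage, mv in samples if usage >= min_load_percent]
    if not loaded:
        raise ValueError("no samples at or above the load threshold")
    return min(loaded)


def snap_to_step(target_mhz, current_mhz, step_mhz=15):
    """Closest frequency to the target reachable from current in whole steps."""
    steps = round((target_mhz - current_mhz) / step_mhz)
    return current_mhz + steps * step_mhz


# Made-up monitoring samples: (GPU usage %, voltage in mV).
samples = [(3, 650), (99, 1050), (98, 1037), (100, 1043), (5, 700)]
stable_mv = lowest_voltage_under_load(samples)

# Hypothetical stock frequency of that point and the frequency we'd like.
stock_mhz = 1980
wanted_mhz = 2075
print(stable_mv, snap_to_step(wanted_mhz, stock_mhz))
```

The idle samples (3% and 5% usage) are ignored on purpose, since the low idle voltage says nothing about where the card settles under load.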
IMPORTANT NOTE: remember that, because of the temperature thresholds, the GPU core clock might still drop by one or more 15MHz steps if your card's temperature keeps increasing under load.
The crucial thing to check, though, is that your GPU voltage is stable and constant. This means you nailed the card's stable V/F point and you're running the card within the power limit spec.
OK cool, but I want MOAR core clock
To further increase your core clock until your GPU physically can't run any higher frequency because of voltage or temperature, just open the V/F curve editor in Afterburner again and drag the stable V/F point even higher.
If you crash because you pushed too far, I HIGHLY suggest resetting your card completely in Afterburner by pressing CTRL+D and redoing the curve from scratch, still using the same stable V/F curve point of course.

-GPU MEMORY OC
GPU memory OC is pretty straightforward. Just like on older card generations, add an offset to the stock frequency in Afterburner. To stay within reasonable ranges, I'd say push your memory clock by +250/300 at a time.
So basically add +250 MHz to your memory offset and run Superposition to make sure it's stable (8K preset recommended for memory OC stability). If it's stable, add another 250MHz or so, and so on until you crash or start losing performance.

--Potential fix to thermal downclocking on modern nvidia cards (BSOD RISK)
--DISCLAIMER 2
Before starting: as far as I could test, this should theoretically work fine on most Pascal based cards. On Turing, though, this procedure can result in the driver BSODing, forcing you to boot into safe mode and reinstall the driver. I'm still not sure why it sometimes works and sometimes doesn't; I think it might have something to do with particular card and BIOS combinations, or something else at the driver level.
Anyway, here I'll try to write a guide on how to try to fix the thermal downclocking issue on modern cards.
--First things first, open MSI Afterburner or similar software and reset your card to stock values.
--Now open the Nvidia Control Panel, click on the "Desktop" tab at the top and check "Enable developer settings". A new menu should pop up in the panel; open it and select "Allow access to the GPU performance counters to all users". Now hit Apply. Your screen may flicker for a second and then recover.
--After applying that, restart your system.
--Once the system has restarted, download and open NSudo, select TrustedInstaller as user and check "Enable all privileges".
--Now download this small script I made.
--Click on the "Browse" box in NSudo and select the script you just downloaded.
--Hit Run in NSudo and a small cmd box should pop up on the screen, confirming the GPU clock policy was set to unrestricted.
--After running the script you can close NSudo and the cmd prompt.
--Now download NVPMManager, a handy little tool that applies all the registry entries automatically so you don't have to do it manually.
--Open NVPMManager with admin rights.
--Click on "Create PowerMizer settings".
--Below, check "Enable PowerMizer feature".
--Set the first 2 boxes below as shown, so "fixed performance level" Max Perf/Min Powersave.
--Check the "Overheat Slowdown Override" box and set it to "Disable Overheat Slowdown".
--Now hit Apply and reboot your system.
--Now comes the critical part: rebooting might or might not result in a BSOD. If you manage to reboot successfully, the procedure is probably working fine. If you can't boot into Windows and you get BSODs, usually BAD_POOL_HEADER or straight up nvlddmkm.sys, it means that something is obviously wrong. As said before, that's what I'm currently trying to figure out; I'm not yet sure what makes this procedure work or fail. This is pretty much all Turing dependent; on Pascal cards I've never seen it fail so far.
--Continuing with the guide: if you managed to boot correctly and everything is working fine, open Afterburner and you should see your GPU locked at around 700ish MHz at idle, with the GPU memory clock always at full speed. Now just set a stable curve point, as shown in the first part of the doc, so you won't power throttle, and then run something to warm up the GPU. Theoretically, your GPU should no longer downclock because of reached temperature steps.
Download links for this method:
-NSudo: http://www.mediafire.com/file/2o9f80bj0h1gtti/NSudo.exe/file
-set_unrestricted_clock_policy.bat: http://www.mediafire.com/file/9add8z6lg6g61jb/set_unrestricted_clock_policy.bat/file
-NVPMManager: http://www.mediafire.com/file/osyoet15m2xo2lm/NVPMManagerUni.exe/file

--Alternative method to potentially stop thermal downclocking
--Here's a second method to try to achieve the same thing.
--Again, reset the GPU to stock values and allow access to all users in the Nvidia Control Panel.
--Run set_unrestricted_clock_policy.bat with NSudo as the TrustedInstaller user with all privileges enabled.
--Download ThermSpy and open it with NSudo, again as the TrustedInstaller user with all privileges enabled.
--Now open the Test P-States section of ThermSpy.
--Keep Afterburner open next to ThermSpy to check live whether the changes take effect.
--Now just click on "Turn off" right next to the adaptive clocking box (don't mind if the adaptive clocking box resets to the ON position).
--The card should now be locked at full speed on both core and memory.
--Now set a stable curve point as shown before and run something to check whether the card is still downclocking.
--If nothing went wrong and everything works, your GPU should no longer downclock because of temps.
Download links for this method:
-set_unrestricted_clock_policy.bat: http://www.mediafire.com/file/9add8z6lg6g61jb/set_unrestricted_clock_policy.bat/file
-NSudo: http://www.mediafire.com/file/2o9f80bj0h1gtti/NSudo.exe/file
-ThermSpyPremium: http://www.mediafire.com/file/n9dzjr34t3jmvv4/ThermSpyPremium.exe/file
So far I can confirm these 2 methods work for me on my 2080 Ti FTW3 on all Galax XOC BIOSes and also on the stock 375W EVGA BIOS.
Here we are at the end of the doc. I hope this guide was useful and has helped someone. I'd be glad to help if you have any doubts or have run into issues.
If you liked this guide and found it helpful, please consider offering me a cup of coffee through a donation: http://paypal.me/Cancerogeno
MY WEBSITE: https://sites.google.com/view/cancerogenoslab
HWBOT PROFILE: https://hwbot.org/user/cancretto/
TWITTER PROFILE: https://twitter.com/Cancerogeno_
By Cancerogeno