2017-07-27

Switching From Nouveau to the Proprietary NVIDIA Driver in Debian Jessie

A couple of weeks ago, I started to experience random crashes on my 64-bit jessie desktop machine. Generally, these took the form of either a frozen desktop or a sudden blank screen, along with a complete lack of responsiveness to either mouse or keyboard.

Usually there was no obvious associated entry in any system log, but eventually a series of messages appeared in the syslog file, starting with this one:

Jul 17 13:55:05 homebrew kernel: [24064.296254] nouveau E[ PFIFO][0000:01:00.0] write fault at 0x000029d000 [PTE] from GR/GPC0/GPCCS on channel 0x003fbad000 [Xorg[2071]] 

This was followed by several more messages that appeared to be related, the last of which was:

Jul 17 13:58:23 homebrew kernel: [24262.075187] nouveau E[ DRM] GPU lockup - switching to software fbcon

(On this occasion, although the desktop was non-responsive after the first message, I could still ssh into the machine, and shut it down cleanly from the ssh session, which is why there is a period of several minutes between these two messages.)
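For anyone wanting to check their own logs for the same symptoms, a filter along the following lines should pick out the relevant entries. This is only a sketch: the function name nouveau_errors is made up for illustration, and /var/log/syslog is assumed to be where kernel messages end up (the default on a stock jessie install).

```shell
#!/bin/sh
# Print only the kernel error lines emitted by the nouveau driver.
# Reads syslog-format text on standard input.
nouveau_errors() {
    grep -E 'kernel: .*nouveau E\['
}

# Typical use (reading the log usually needs root or membership
# of the adm group):
#   nouveau_errors < /var/log/syslog
```

An empty result does not prove that nouveau is innocent, since a hard lockup may leave no time for anything to be written to disk.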

This suggested that the cause lay in the nouveau video driver, so I decided to switch to the proprietary NVIDIA driver. This turned out not to be as easy as one might expect, since there didn't seem to be a single place that documents the complete procedure in detail. Hence this post.

Here are the steps that I followed:

1. Install the nvidia-driver package.

2. Install the nvidia-xconfig package.

3. Run nvidia-xconfig.

This complained about the lack of an xorg.conf file, but generated a default one that names nvidia as the driver. There were several other errors, but rebooting at this point resulted in a system that booted and ran X.
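Steps 1 to 3 can be sketched as a short script. It assumes that the contrib and non-free components are already enabled in /etc/apt/sources.list, since that is where Debian keeps the NVIDIA packages. The run wrapper and the DRY_RUN variable are conveniences invented for this sketch: by default the script only prints what it would do, and it has to be run as root with DRY_RUN=0 to change anything.

```shell
#!/bin/sh
# Sketch of steps 1-3. By default this only prints the commands;
# run it as root with DRY_RUN=0 to execute them.
# Assumes "contrib non-free" is already enabled in /etc/apt/sources.list.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, else execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

install_nvidia() {
    run apt-get update
    run apt-get install nvidia-driver nvidia-xconfig
    run nvidia-xconfig    # generates the xorg.conf described above
}

install_nvidia
```

A reboot is still needed afterwards so that the nvidia kernel module is loaded in place of nouveau.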

So far so good, but during the boot sequence I noticed that the text on the system console was enormous. Similarly, if I switched to the console once the system had booted, it appeared to be about 80x24 characters, which is quite obnoxious on a 27-inch monitor.

Following the instructions at:

https://wiki.archlinux.org/index.php/GRUB/Tips_and_tricks#Setting_the_framebuffer_resolution

I added two lines to the file /etc/default/grub:

GRUB_GFXMODE=1280x1024x16,1024x768,auto 
GRUB_GFXPAYLOAD_LINUX=keep

DO NOT DO THIS.

After executing
  grub-mkconfig -o /boot/grub/grub.cfg
and rebooting, although the text on the console looked much better, I no longer had any X-based desktop. Switching to :0 merely gave me a blank screen. So I restored the grub.cfg file to the original version.
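Given how easily a change like this can leave the machine without a desktop, it seems prudent to keep a copy of each file before touching it. The two helpers below are a hypothetical sketch, not a standard tool: backup_file saves a copy with an .orig suffix (and refuses to overwrite an existing copy), and restore_file puts it back.

```shell
#!/bin/sh
# Save a pristine copy of a file as FILE.orig, unless one already exists,
# so that repeated experiments never overwrite the original.
backup_file() {
    [ -e "$1.orig" ] || cp -p "$1" "$1.orig"
}

# Put the saved copy back in place.
restore_file() {
    cp -p "$1.orig" "$1"
}

# Typical use, as root, before editing:
#   backup_file /etc/default/grub
#   backup_file /boot/grub/grub.cfg
# and to undo an experiment:
#   restore_file /etc/default/grub
#   restore_file /boot/grub/grub.cfg
```

Restoring both files matters: if only grub.cfg is put back, the next run of grub-mkconfig will regenerate it from the edited /etc/default/grub.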

The above-named URL also describes a deprecated mechanism for changing the console resolution, so that's what I ended up using. In particular, I changed one line of the /etc/default/grub file to read:

GRUB_CMDLINE_LINUX_DEFAULT="quiet vga=794"

and executed: 
  grub-mkconfig -o /boot/grub/grub.cfg

According to that page, this gives a 1280x1024 16-bit console, which is a somewhat lower resolution than I had with the nouveau driver, but is vastly better than the resolution without this line in the grub configuration file.
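The number 794 looks arbitrary, but it is simply the decimal form of the VESA mode number 0x31A. The sketch below makes the conversion explicit and looks up a handful of the common 16-bit modes. The function name describe_vga_mode is made up for this example, and the table is deliberately short; which modes actually work depends on what the video BIOS offers.

```shell
#!/bin/sh
# Describe a few common 16-bit VESA modes, given the decimal value
# that would follow vga= on the kernel command line.
describe_vga_mode() {
    hex=$(printf '0x%X' "$1")
    case "$1" in
        785) res="640x480" ;;
        788) res="800x600" ;;
        791) res="1024x768" ;;
        794) res="1280x1024" ;;
        *)   res="" ;;
    esac
    if [ -n "$res" ]; then
        echo "vga=$1 is VESA mode $hex: $res, 16-bit color"
    else
        echo "vga=$1 is VESA mode $hex: not in this short table"
    fi
}

describe_vga_mode 794
# prints: vga=794 is VESA mode 0x31A: 1280x1024, 16-bit color
```

After rebooting, the contents of /proc/cmdline show whether the vga= setting actually reached the kernel.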

Now everything is working to my satisfaction. The only quirk I see is that at boot time, there is a LOT of disk activity for about 30 seconds after the desktop starts. I'm not sure what the reason for this might be, but at the end of it I have a fully-functioning system with a KDE desktop on :0 and i3 on :1, and can switch to a reasonable-looking console at will.

The best news is that, at least so far, I have experienced no system crashes since switching to the proprietary driver.
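To confirm that the proprietary driver, rather than nouveau, is the one actually loaded, the output of lsmod can be filtered as in the sketch below. The function name gpu_modules is hypothetical, and the match is deliberately loose (anything starting with nvidia or nouveau) because the exact module names can differ between driver versions.

```shell
#!/bin/sh
# Read lsmod output on standard input and print the names of any
# loaded modules that start with "nvidia" or "nouveau".
gpu_modules() {
    awk '$1 ~ /^(nvidia|nouveau)/ { print $1 }'
}

# Typical use:
#   lsmod | gpu_modules
```

If the switch has worked, the list should contain an nvidia entry and no nouveau entry.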


