Saturday, March 1, 2014

I love Linux, but not Nvidia!

I am really beginning to hate Nvidia! I just spent another 4 hours recovering from an update which confused the Nvidia drivers to no end.  After lots of Googling, rebooting, and testing, here's what worked for me; hopefully it will help others (there certainly are lots of posts out there with similar problems).

For the record, I'm running Kubuntu 13.10, 64-bit, with a GeForce GT630 Nvidia card and have two monitors configured side-by-side through nvidia-settings. Also for the record, the later versions of nvidia-settings actually work very well with multiple monitors, unlike earlier versions which required a great deal of hackery.

Initial symptoms
After updating some applications I decided to reboot (I am rather paranoid these days about updates and rebooting, as I have had numerous Nvidia issues). The login screen loaded fine (KDM, or actually lightdm on my version), but when I tried to log in to my account, the session failed to start and dropped me back to the basic GUI login prompt, over and over.

The first problem I dealt with was finding and fixing any files in my home directory that were owned by root. This is a well-known problem on Ubuntu (maybe other distributions?) wherein root owns a few files and interrupts KDM loading because of permission issues. To resolve these issues, I ran
find -user root
in my home directory and it listed all files owned by root. Then I ran:
sudo chown jseidel:jseidel *
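on the root-owned files to make them mine. Note that * does not match dotfiles such as .Xauthority, so those have to be named explicitly. If any of the files are symbolic links, you may need the -h option on chown, which changes the link itself rather than the file it points to.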
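The find-then-chown steps above can be combined so that dotfiles and subdirectories are covered in one pass. This is only a minimal sketch: the function names list_foreign_files and reclaim_files are made up for illustration, and it assumes GNU find and that everything under the directory should belong to the user running the command.

```shell
# Hypothetical helper: list everything under a directory that is NOT owned
# by the given numeric user ID (defaults to the current user).
# Unlike a shell "*", find also reports dotfiles such as .Xauthority.
list_foreign_files() {
    dir="${1:-$HOME}"
    owner="${2:-$(id -u)}"
    find "$dir" ! -user "$owner" -print
}

# Hypothetical helper: hand those files back to the current user.
# chown -h changes a symlink itself instead of the file it points to.
# If find matches nothing, chown is not run at all.
reclaim_files() {
    dir="${1:-$HOME}"
    find "$dir" ! -user "$(id -u)" -exec sudo chown -h "$(id -u):$(id -g)" {} +
}
```

Running list_foreign_files first and reading its output before calling reclaim_files keeps the chown from touching anything unexpected.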

One particular problem file is .Xauthority, which somehow ends up owned by root from time to time and will certainly stop your desktop from loading, whether you use KDM or Gnome.
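A quick way to check that one file before digging any further is sketched below. The function name check_xauthority is hypothetical, and the check relies on the shell's -O test, which is true when the file is owned by the user running the shell.

```shell
# Hypothetical helper: report whether .Xauthority (or another file passed
# as the first argument) is owned by the user running this shell.
check_xauthority() {
    file="${1:-$HOME/.Xauthority}"
    if [ ! -e "$file" ]; then
        echo "missing: $file"
    elif [ -O "$file" ]; then
        echo "ok: $file"
    else
        echo "wrong owner: $file"
        return 1
    fi
}
```

From a text console (Ctrl-Alt-F1), calling check_xauthority with no arguments checks the copy in your own home directory.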

What finally worked for me
As I progressed through numerous tests, I noticed that when I dropped into a shell (Ctrl-Alt-F1), I got the message
initctl: Event failed
which can indicate an Nvidia driver issue (I've had these many times before -- always a PITA!)

When I looked in the kernel log after a failure:
less /var/log/kern.log
I found a series of messages like:
API mismatch...
Nvidia client is version 331.49
Kernel is version 319.60
So... the Nvidia installer goofed! What finally worked for me in this situation was the following:
sudo apt-get remove --purge 'nvidia*'
followed by reinstalling the appropriate Nvidia driver, V331.20 in my case. I do my installations from the command line like so, after downloading the desired Nvidia driver:
cd ~/Drivers
chmod +x
sudo ./

Some useful commands:
less /proc/driver/nvidia/version     # Show installed version
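The two checks used in this post (the loaded kernel module version and the "API mismatch" lines in the kernel log) can be wrapped into small helpers, sketched below. The function names are hypothetical, the version pattern assumes the first line of /proc/driver/nvidia/version starts with "NVRM version" and contains a dotted version number, and GNU grep is assumed for the -m and -o options.

```shell
# Hypothetical helper: print only the version number of the loaded Nvidia
# kernel module. Assumes the file has an "NVRM version" line containing a
# dotted version number.
nvidia_module_version() {
    file="${1:-/proc/driver/nvidia/version}"
    grep -m 1 'NVRM version' "$file" \
        | grep -oE '[0-9]+\.[0-9]+(\.[0-9]+)?' \
        | head -n 1
}

# Hypothetical helper: show the most recent "API mismatch" lines from the
# kernel log (at most five). Prints nothing if there are none.
recent_api_mismatches() {
    log="${1:-/var/log/kern.log}"
    grep -i 'API mismatch' "$log" | tail -n 5
}
```

If recent_api_mismatches prints anything after a failed login, compare the versions it mentions with the output of nvidia_module_version; they should be the same once the driver is installed cleanly.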

Some possibly useful links: