by Pablo de Castro

It is about time for my second post of this technical series. This piece includes some tips and tricks that are useful when using Secure Shell (SSH) to access remote computers. It is a must-read if you still type your username@hostname and password every time you log in, if you do not know what tunnelling is, or if you are annoyed by the inconvenience of accessing remote files.

You might have given a terminal multiplexer a chance (check out my previous post in this series on that topic if not), but you are still not comfortable working with remote computers.

In this post I will review a few tricks that can make accessing remote systems’ resources way less annoying. The topics are broken down into the small bits listed in this index for your convenience:

  1. Only type your password one more time
  2. Keep your defaults in a .ssh/config
  3. Are you digging tunnels?
  4. Incrementally transferring folders and files
  5. Virtual remote directories to the rescue

I hope these are as useful for you as they are for me! Feel free to leave doubts, feedback or corrections in the comment section.

Only type your password one more time

Do you type your SSH login password every time you access a remote server? That is both insecure (due to password reuse and possible brute forcing) and time consuming. The standard procedure for secure and password-less SSH access is using public-key cryptography, which is a rather simple concept that can be used for authentication and is available in every modern SSH client and server implementation by default.

Before we go on, beware that for CERN and FNAL shared computing resources (e.g. lxplus) the preferred access method is Kerberos authentication, which is a different protocol (based on symmetric-key cryptography instead), but also easy to use in practice.

Conceptual scheme for public-key/asymmetric cryptography, see explanation below.

In basic terms, public-key cryptography is based on generating a pair of keys so that one of them can be used to encrypt data (public key), while the other is required to decrypt whatever is encrypted (private key).

A possible use of this technology for remote login authentication is the following: the public key is stored in the remote system and used to encrypt some data, which is sent back to the system attempting to log in. Using the private key, this data can be decrypted and returned to the remote system, which will allow access only if the original data and the decrypted data it receives match. In addition to SSH, public-key cryptography is also relevant for GPG/PGP (e.g. signing documents), TLS/SSL authentication (e.g. browsing the web securely), and Bitcoin-like currencies.
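
The challenge-response idea above can be illustrated with openssl. This is only a toy sketch of the scheme just described (the real SSH protocol differs in its details), and all the file names are made up:

```shell
# Generate a toy key pair; the "private key" never leaves the client
openssl genrsa -out priv.pem 2048 2>/dev/null
openssl rsa -in priv.pem -pubout -out pub.pem 2>/dev/null

# The "server" encrypts a random challenge with the public key...
echo "random challenge" > challenge.txt
openssl pkeyutl -encrypt -pubin -inkey pub.pem -in challenge.txt -out challenge.enc

# ...and only the private-key holder can recover the original data
openssl pkeyutl -decrypt -inkey priv.pem -in challenge.enc -out decrypted.txt
diff challenge.txt decrypted.txt && echo "challenge matched: access granted"
```

Note that the encrypted challenge is useless to an eavesdropper: without priv.pem there is no way to produce the matching answer.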

I hope you found that explanation interesting, but do not worry if not, because, as usual, knowing how stuff works is not a requirement for using it. As I mentioned before, public-key authentication is built into every modern SSH implementation.

The first step is to generate the public and private keys, so open your preferred unix-like terminal and type:

ssh-keygen -t rsa

In the process, you will be asked for a filename for the key pair and optionally for a passphrase. Beware that with public-key authentication, anyone who gets hold of your private key will be able to access all your remote systems, so make sure that your system files can only be accessed by you. As an additional security measure, you can choose a passphrase in the previous step, which will be used to encrypt the private key on your disk. It should be strong and long, or anyone who steals the key file could crack it quickly.
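
If you want to script this step, ssh-keygen can also run non-interactively. A small sketch, where the demo_key filename is made up and -N '' means an empty passphrase (only acceptable if you trust your disk):

```shell
# Generate a key pair without any prompts (writes demo_key and demo_key.pub)
ssh-keygen -t rsa -b 2048 -N '' -f demo_key -q

# The private key must only be readable by you
# (ssh-keygen already sets this; shown here for emphasis)
chmod 600 demo_key
ls -l demo_key demo_key.pub
```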

Now the only missing step is to transfer the newly generated public key (~/.ssh/id_rsa.pub) to your remote system; specifically, it has to be appended to the ~/.ssh/authorized_keys file there. You could do this manually (e.g. with scp, copy-paste, or see the end of this post), or with the ssh-copy-id tool, which is available by default on most systems and does exactly that:

ssh-copy-id username@hostname

And that will be the last time you are asked for the SSH login password. The next time you access the remote system you will not have to type any password, because you will be authenticated using public-key cryptography.
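
If ssh-copy-id is not available, the manual route amounts to appending your public key to the remote authorized_keys file. A sketch follows: the ssh line is commented out because it needs a real server (username@hostname is a placeholder), and the local commands below only illustrate the append step with a made-up directory and key:

```shell
# Equivalent of ssh-copy-id, done by hand (run against your own server):
# cat ~/.ssh/id_rsa.pub | ssh username@hostname \
#   'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys && chmod 600 ~/.ssh/authorized_keys'

# Local illustration of what happens on the remote side:
mkdir -p fake_remote/.ssh
echo "ssh-rsa AAAAB3... user@laptop" >> fake_remote/.ssh/authorized_keys
chmod 600 fake_remote/.ssh/authorized_keys
```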

Keep your defaults in a .ssh/config

When I want to access lxplus (CERN remote computing shared resources) from one of my computers I just type:

ssh lxplus

and the username, hostname and connection options are chosen automatically. The same applies to every remote system I access with reasonable frequency. This works because I use a ~/.ssh/config configuration file, which can save you quite some typing overhead in the long run.

The syntax of this configuration file is pretty simple: you only need to specify a list of hosts, each using the following syntax:

Host alias
    HostName hostname
    User username

where in my example alias would be lxplus and hostname would be lxplus.cern.ch. You might think this is not a great advantage, because it only saves a few characters each time.

You are right, but the real edge of SSH configuration files shows when you use non-default connection options for each host (e.g. a different private key for each, X11 forwarding, special authentication settings or a different port). Another good thing about an SSH configuration file is that you can use wildcard patterns for default behaviour and that it works with all ssh-related utilities (including scp and the two described at the end of this post). You can find more information about advanced SSH configuration files in this guide and its man page.
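
For instance, a slightly fuller ~/.ssh/config might look like the sketch below; the lxplus entry repeats the earlier example, while webserver, internal.example.org, the port and the key path are made up to show the kinds of options mentioned above:

    Host lxplus
        HostName lxplus.cern.ch
        User myusername
        ForwardX11 yes

    Host webserver
        HostName internal.example.org
        Port 2222
        IdentityFile ~/.ssh/id_rsa_webserver

    Host *
        ServerAliveInterval 60

The wildcard Host * block at the end applies its options to every host that does not override them, which is handy for defaults like keep-alive intervals.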

Are you digging tunnels?

SSH tunnelling, or port forwarding, is a really useful way to connect to remote systems that might not be globally accessible or that sit behind a firewall. For example, you might have a web server (e.g. a jupyter notebook server, which is the topic of one of my post drafts) running on a machine that is not accessible from the internet, but which you can reach (directly or via an intermediate machine) through SSH. Supposing that web server is running at port 8888, you can tunnel the remote port to your local machine with a simple command:

ssh -L localhost:8888:localhost:8888 username@hostname

so now you can access the web server just by opening your preferred web browser and entering localhost:8888 in the address bar.

SSH tunnelling can get a bit hacky, because in addition to local port forwarding we can use remote port forwarding (-R option) to connect to machines behind firewalls, or SOCKS proxy forwarding (-D option) to browse the internet securely or circumvent region-dependent blocks (à la VPN). Note that port forwarding and proxying can also be specified in the SSH configuration files.
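
As an example of that last point, the local forwarding from above can be stored in ~/.ssh/config so that a plain ssh notebook sets up the tunnel automatically (the notebook alias and hostname are placeholders):

    Host notebook
        HostName hostname
        LocalForward 8888 localhost:8888
        # RemoteForward and DynamicForward are the config
        # equivalents of the -R and -D options

This way you never have to remember the port-forwarding syntax again for hosts you tunnel to regularly.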

Incrementally transferring folders and files

For me, one of the most annoying things when I started working with remote systems was accessing and transferring files. The first command you are taught is the good old scp, which works fine for a single file, but is clumsy for whole folders or very large files: even with the -r option for directories, an interrupted transfer has to start again from scratch. A few months later I discovered rsync and rarely used scp again.

What is rsync? It is a tool for one-way synchronization of directories, where both origin and destination can be either local or remote, and it is installed on most remote systems. For example, if I wanted to transfer a remote folder from lxplus to my local computer I would do:

rsync -aP lxplus:remote_folder_path local_folder_path

This would transfer the whole folder and its contents incrementally to my computer, showing a nice progress output while doing it. The real advantage of rsync is that it does the transfer differentially: if the previous command crashes or the contents of the remote folder change, only the files that are missing or outdated in the local folder will be transferred. You can also use a local directory as source and a remote directory as destination. In case you are a HEP physicist: for extremely large/long transfers (e.g. >1 TB) within the LHC Computing Grid other tools exist, so please ask your software administrator before using rsync and overloading the network connection.

Virtual remote directories to the rescue

Now imagine you just want to view, copy or locally edit a few small files on your remote system. You could in principle use scp or rsync twice for that (once to fetch, once to push back), but there is a more convenient alternative called sshfs, a network filesystem client for SSH servers. sshfs might not be installed on your computer, but installing it is reasonably easy and just an internet search away (my suggested search terms are “sshfs install {ubuntu,centos,mac,windows}”).

Basically, this tool allows you to mount the remote filesystem as if it were a folder on your local computer. It is designed so that its use is analogous to ssh, that is:

sshfs hostname:remote_dir mount_dir

This will mount remote_dir from hostname at mount_dir. If you do not specify anything after the colon, your remote home directory will be mounted by default. Now you can open any remote file as if it were actually on your computer. For example, you can open PDFs with your favourite viewer or edit text files with your local graphical text editor. You can also copy, move and paste files from/to your filesystem to/from the remote filesystem.

To unmount the network filesystem you can typically do:

umount mount_dir

which you might have to run as an administrator, depending on your operating system (on Linux, fusermount -u mount_dir usually works without administrator privileges). Take into account that, depending on the network connection between the remote server and your local system, the virtual filesystem can be a bit slow, but I usually find it good enough for viewing or editing small files.