What Is a Shell, and How Does the Command ls -l Work?

7 min read · Aug 21, 2020

To start, let’s take a look at what a shell is. A shell is a powerful tool for giving instructions to a computer through its operating system, such as Unix. Many users today prefer a graphical user interface (GUI), which is a visual way to interact with the computer. However, many software engineers prefer the command line interface because it exposes features that a GUI does not, and in many cases lets them work more efficiently. One of the most common shells is Bash, which stands for “Bourne-again shell”. A shell comes in the form of a command line interface, where a user issues commands to the machine as lines of text, as in the picture below:

I gave the shell commands like pwd, ls, mkdir, touch, and less. These are standard Unix commands: the machine reads each one, runs it, and prints the result. With pwd, for example, I asked the machine for the current working directory (pwd stands for “print working directory”; a directory is what a GUI calls a folder). You can see that I also used the command ls without any arguments. Invoked like this (you simply type the command in the shell and hit ‘Enter’), it lists the files in the current working directory. With a flag like “-l”, it lists the files in long format (that’s what the “-l” is for).

Now, I want to get under the hood and show you what really happens underneath the neat facade of the interface. The question at hand is, what happens when you type ‘ls -l’ in your shell?

User Input

The shell shows a prompt where the user can enter a command like “ls -l”. This is called the user input. A simple shell written in C can read the user input with a function called getline(). This is the typical getline() syntax:

getline(&buffer,&size,stdin);

Here, &buffer is the address of a pointer to the buffer where the input string is stored; getline() allocates or resizes that buffer as needed. &size is the address of the variable that holds the size of the buffer (also a pointer). Finally, stdin is the input stream. In this case, getline() reads a line of text from standard input and stores the user input “ls -l” in the buffer.

Alias and Expansions

The shell first checks for special characters like “, ‘, `, \, *, &, #, etc. In a prior blog post, I explained how the command “ls *.c” works in the shell. The * is a special character called a wildcard; here it tells the shell to list only files ending in a certain suffix (in this case .c), such as main.c, shell.c, or holberton.c. The shell also checks for aliases. An alias is a shortcut name for a command, file or anything in the shell, which saves the user time.

Built-ins

If the shell doesn’t find any special characters or aliases, it checks for built-ins. Built-ins are commands or functions that execute directly in the shell itself.

But what really happens behind all this?

1. The first thing you see when you run a shell is the prompt. It is displayed to tell the user that the shell is ready for a command, and it often looks like the $ that precedes the command in the next step.

2. Now the shell waits for the user input. It reads the input with getline(), which reads a line of text from standard input and stores the user input “ls -l” in a buffer.

$ ls -l

3. Now that we have read the input, we need to split it into individual tokens. Those tokens should be stored in an array (don’t forget to allocate memory for this array, and free it after use). You can do this with the strtok() function.

4. Once our arguments are stored as tokens, they need to be analyzed before execution. We check, in this order, whether the command is an alias (a shortcut name for a command, file or anything in the shell), a built-in (a command or function that executes directly in the shell itself; in this case we check for env, printenv and exit), or a program found through PATH (an environment variable that lists the directories the operating system searches for executables).

In the case of PATH, we get the value of the environment variable with the getenv() function, and we would get something like this:

PATH=/home/vagrant/.vscode-server/bin/db40434f562994116e5b21c24015a2e40b2504e6/bin:/home/vagrant/.vscode-server/bin/db40434f562994116e5b21c24015a2e40b2504e6/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games

We also need to tokenize this value, using “:” as the delimiter, to get each directory individually.

5. The next step is to execute the program. For this we use the system calls fork(), execve() and wait().

The concept of a process is fundamental to Unix/Linux operating systems: every running instance of a program is a process. Processes are distinguished by their process ID (PID), a non-negative number that is unique to each running process.

Img 2: Graphical System Calls (taken from geeksforgeeks)

With fork() we create a new process, called the child process. The child is a copy of the parent. After fork(), both parent and child execute the same program, but in separate processes.

After we have created the child process, the child calls execve() to run the ls command, which replaces the program running in the child with “ls”; meanwhile, the parent process waits until the child finishes. The exec system call replaces the program executed by a process: the child uses it after a fork to replace its memory space with a new executable, so that it runs a different program than the parent.

Once “ls” has finished, the shell goes back to the beginning, prints the prompt, and waits for more user input. This cycle continues until the user runs the exit built-in command or presses Ctrl-D.

Functions and system calls used

GETLINE:

For more information, type man 3 getline in your terminal.

man 3 getline

The arguments of the getline function are:

The first argument is the address of a pointer (char **) to the buffer where the input string will be stored.
The second argument is the address of the variable that holds the size of the input buffer (size_t *), another pointer.
The third argument is the input stream (FILE *), stdin in this case. You could use getline() to read a line of text from a file, but when stdin is specified, standard input is read.
Return: The number of characters read on success (including the newline, but not the terminating null byte), and -1 on failure or end of file.

STRTOK:

For more information, type man 3 strtok in your terminal.

man 3 strtok

The arguments of the strtok function are:

The first argument is the string to be split (or NULL to continue splitting the same string).
The second argument is a string containing the delimiter characters.
Return: A pointer to the next token, or NULL if there are no more tokens.

GETENV:

man 3 getenv

For more information, type man 3 getenv in your terminal.

The only argument is the name of the environment variable.
Return: A pointer to the value in the environment, or NULL if there is no match.

FORK:

man 2 fork

For more information, type man 2 fork in your terminal.

Return: On success, the PID of the child process is returned in the parent, and 0 is returned in the child. On failure, -1 is returned in the parent, and no child process is created.

EXECVE:

man 2 execve

For more information, type man 2 execve in your terminal.

Return: Nothing on success (execve does not return), and -1 on failure.

WAIT:

man 2 wait

For more information, type man 2 wait in your terminal.

Return: On success, the process ID of the terminated child; on error, -1.

I hope this information helps you understand a little better what a shell does behind the scenes when we execute a simple command like ls -l.