What Is the sscanf() C Function
In the C programming language, the sscanf() function lets you read data from a string, similar to how you might read data from standard input (stdin) using the scanf() function. However, unlike scanf(), sscanf() does not read from standard input; it reads from a character string. The function’s syntax is as follows:
int sscanf(const char *str, const char *format, ...);
The name ‘sscanf’ stands for “string scanf”. The function is declared in the stdio.h header file, which you must include before you can use it. The versatility of sscanf() lies in its ability to read and interpret data from a string based on a specified format.
While sscanf() is very useful, it can also open the door to security vulnerabilities. When using this function, you should be aware of the risk of buffer overflows (for example, when reading a string with %s and no field width), the risks of the %n specifier, and other important security best practices, which you can find at the end of this article.
Common Use Cases for the sscanf() Function
- Parsing Strings: One of the most common uses for sscanf() is parsing strings. With sscanf(), you can easily parse strings into individual words or phrases. This becomes particularly useful when you need to extract specific information from a larger string. For instance, let’s say you have a string that contains a date in the format ‘dd-mm-yyyy’. You can use sscanf() to parse this string and extract the day, month, and year as separate integer variables.
- Reading Formatted Data: Another popular use of the sscanf() function is to read formatted data from a string. This can be useful when the data you’re dealing with is structured in a certain way. For example, imagine you have a string that contains a list of names, each followed by their corresponding age. You can use sscanf() to read this data and store the names and ages into separate variables. This allows you to manipulate and use the data in a more intuitive manner.
- Converting String to Number: The sscanf() function also comes in handy when you need to convert a string to a number. This is often the case when you’re dealing with user input, as most input methods will provide the data as a string. Let’s say you’re developing a calculator application and the user inputs a number as a string. You can use sscanf() to convert this string into an integer or a float.
- Reading Data from Buffer: Lastly, the sscanf() function is commonly used to read data from a buffer. This is especially useful in situations where you need to extract data from a larger chunk of information. For example, if you have a buffer that contains a packet of text-based network data, you can use sscanf() to read and interpret this data based on a specified format. This allows you to extract the information you need from the buffer, making it easier to analyze and manipulate the data.
How sscanf() Works: Syntax and Code Example
The syntax of sscanf() is as follows:
int sscanf(const char *str, const char *format, ...);
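Here, str is the string from which to read the data, format specifies the type and layout of the data to be read, and the ellipsis (...) represents pointers to the variables where the read data will be stored. The function returns the number of input items successfully assigned, or EOF if the input ends before the first conversion.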
To see how this works, let’s look at a simple code example:
#include <stdio.h>

int main(void)
{
    char date[11] = "14-07-2023";
    int day, month, year;

    /* Extract the three integer fields separated by '-' */
    sscanf(date, "%d-%d-%d", &day, &month, &year);
    printf("Day: %d, Month: %d, Year: %d\n", day, month, year);
    return 0;
}
In this code, we have a string date that contains a date in the ‘DD-MM-YYYY’ format. We use sscanf() to parse this string and extract the day, month, and year as separate integer variables. We then print these variables to the console.
The output of this code would be:
Day: 14, Month: 7, Year: 2023
As you can see, the sscanf() C function provides a versatile way to read and interpret data from a string. You can similarly use it to parse strings, read formatted data, convert strings to numbers, or read data from a buffer.
sscanf Alternatives
There are several alternatives to the sscanf() C function. Each comes with its own pros and cons, and in some cases might be more secure than sscanf().
strtok() Function
The syntax of strtok() is as follows:
char *strtok(char *str, const char *delims);
This function works by breaking a string into tokens, hence its name. It’s particularly handy when you need to parse a string into smaller chunks based on specific delimiters.
Pros:
- No risk of buffer overflow: strtok() doesn’t suffer from the same risk of buffer overflows as sscanf(), because it splits the string in place instead of copying data into caller-supplied buffers.
- Simplicity: strtok() is easy to use and accepts multiple delimiter characters in a single call.
Cons:
- String mutation: strtok() changes the original string by overwriting delimiters with null characters, which can lead to unintended side effects if the string is expected to remain constant.
- Not thread-safe: strtok() keeps internal state between calls, which means it could cause concurrency issues when used in multi-threaded applications.
atoi(), atol(), atof() Functions
The syntax of these functions is as follows:
int atoi(const char *str);
long int atol(const char *str);
long long int atoll(const char *str);
double atof(const char *str);
The atoi(), atol(), and atof() functions, declared in stdlib.h, are also alternatives to sscanf(). These functions convert a string to an int (atoi), a long (atol), a long long (atoll), or a double (atof).
Pros:
- Simplicity: These functions are straightforward to use, making them less prone to misuse and related security risks.
Cons:
- Lack of error checking: These functions lack robust error checking, meaning they can silently fail or produce incorrect results without any clear indication of an error. This can lead to unexpected behavior, potentially introducing security vulnerabilities.
- Return zero: These functions return zero when they can’t convert the string, which might be mistakenly considered as valid output and not an error.
strtol(), strtod() Functions
The syntax of these functions is as follows:
long int strtol(const char *str, char **endptr, int base);
double strtod(const char *str, char **endptr);
Another set of functions that can serve as alternatives to sscanf() are strtol() and strtod(). These are similar to atol() and atof(), respectively, but they provide more robust error checking.
The strtol() function converts a string to a long integer, while the strtod() function converts a string to a double-precision floating-point number.
Pros:
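- Error checking: These functions provide more robust error checking than atoi(), atol(), and atof(). The endptr argument shows where parsing stopped, so you can detect input with no digits or with trailing characters. This makes them less likely to silently fail, reducing the potential for unexpected behavior and associated security vulnerabilities.
- Better overflow handling: They handle numeric overflows better than their simpler counterparts. When a value is out of range, strtol() returns LONG_MAX or LONG_MIN, strtod() returns plus or minus HUGE_VAL, and both set errno to ERANGE, so the caller can detect the problem.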
Cons:
- Complexity: The added complexity of these functions could lead to misuse if not understood properly.
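- Memory-safety risk: strtol() and strtod() do not copy data into a buffer, but misuse, such as passing a string that is not null-terminated or dereferencing endptr carelessly, can still result in out-of-bounds reads, so appropriate precautions should be taken.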
Security Best Practices
While the sscanf() C function is flexible and useful, it’s also susceptible to various security issues if not used properly. Here are some best practices to keep in mind to ensure secure use of sscanf().
- Check the Return Value: The return value of sscanf() indicates how many items were successfully read. If this number is less than the number of items you expected to read, the input did not fully match the format. By checking the return value, you can ensure that sscanf() has successfully read the data. This practice can prevent many potential issues and bugs later on.
- Be Cautious with the %n Specifier: The %n specifier in sscanf() can be a significant security risk. This specifier doesn’t consume any input; instead, it writes the number of characters read so far into the corresponding argument. The danger with %n is that, if an attacker can influence the format string or the pointer arguments, it can lead to a write-what-where condition, which is a severe security vulnerability. Therefore, it’s vital to be cautious when using the %n specifier and avoid it altogether if possible.
- Initialize Your Variables: Reading an uninitialized variable leads to undefined behavior, which can cause a variety of issues, from incorrect program output to crashes. When sscanf() stops early, it leaves the remaining arguments untouched. By initializing your variables, you ensure that they have a known value even if sscanf() never assigns them. This practice makes your code safer and more predictable.
Deterministic Security for IoT
The vast majority of IoT/embedded devices use C code and are prone to related memory and code vulnerabilities, including vulnerabilities stemming from improper use of sscanf() or similar functions.
Sternum’s patented EIV™ (embedded integrity verification) technology protects from these with runtime (RASP-like) protection that deterministically prevents all memory and code manipulation attempts, offering blanket protection from a broad range of software weaknesses (CWEs).
Embedding itself directly in the firmware code, EIV™ is agentless and connection agnostic. Operating at the bytecode level, it is also universally compatible with any IoT device or operating system (RTOS, Linux, OpenWrt, Zephyr, Micrium, FreeRTOS, etc.) and has a low overhead of only 1-3%, even on legacy devices.
The runtime protection features are also augmented by (XDR-like) threat detection capabilities of Sternum’s Cloud platform, its AI-powered anomaly detection, and extended monitoring capabilities.
To learn more, check out these case studies of how this technology was used to:
- Help a Fortune 500 company catch memory leaks in pre-production (Zephyr device)
- Uncover buffer overflow vulnerabilities in 80,000 NAS devices
- Uncover buffer overflow vulnerability in a very popular smart plug device