Python: Working with RPMs

In this post I’ll cover:

  • Using the dnf module for querying for RPMs
  • Extracting RPM details with the re module
  • Comparing RPM versions

The complete code is included at the end.

Query for RPMs

In this part I’ll show you how to use the DNF module in order to search for RPMs and extract information on them such as arch, version and name.

The DNF module

We will use the DNF python module throughout this post. It’s a great module for working with RPMs and includes lots of improvements when compared to the previous YUM python module.

Let’s start by importing it and creating a dnf.Base instance.

import dnf

base = dnf.Base()

dnf.Base is the main component of the DNF module. You will use it for almost every operation (query, install, upgrade, etc) which often means you will only create one instance of it for all your app’s operations.

The dnf.Base is a stateful object, which means that the result of some of its methods is affected by previous calls.

Read the repositories

Next, we’ll read all the repositories defined in the directories listed in ‘dnf.conf.Conf.reposdir.default’

> print(dnf.conf.Conf.reposdir.default)
[u'/etc/yum.repos.d', u'/etc/yum/repos.d', u'/etc/distro.repos.d']

We’ll do it by using the base instance we created earlier. It will read all the files ending with the .repo suffix from all the locations in the above list.

base.read_all_repos()

At this point you should be able to list all the available repositories on your system

> print(base.repos)
{u'updates-source': <Repo updates-source>, u'rpmfusion-free-updates-testing': <Repo rpmfusion-free-updates-testing>}

Query for packages

For querying, we would need to use base.sack, which contains all the metadata information on installed and available packages.

You should initialize base.sack before starting to query or run operations on packages with the dnf module. You can do it by running the following line

base.fill_sack()

Next, we should create a query instance, which we’ll use to query the packages, using specific criteria.

sack_query = base.sack.query()

We can narrow its scope to RPMs that are available rather than installed

avail_rpms = sack_query.available()

We can also filter it by a specific string. Let’s say we want to list all the packages whose name contains the substring ‘zlib’

zlib_rpms = avail_rpms.filter(name__substr='zlib')

You can then iterate through all the found packages (available & containing the string ‘zlib’)

for pkg in zlib_rpms:
    print(pkg)

Extract RPM attributes using the re module

I won’t cover re here as it deserves a separate post (or even several posts), but I will show you how to use it on strings such as ‘zlib-1.2.8-10.fc28.x86_64’ to extract the bits you are interested in.

Extract the name

To extract the name from an RPM string, use the following line

import re

name = re.search(r'(^[a-zA-Z0-9-]*)-\d', rpm_string)

The first part of the regular expression is ^[a-zA-Z0-9-], which matches a single character at the beginning of the string: a letter (lower or upper case), a digit or a hyphen (‘-‘).

Since package names are longer than one character and some are composed of two words (e.g. python2-ipaddr, python-webob), we also add the asterisk sign. So now, using (^[a-zA-Z0-9-]*), we match a whole run of letters, digits and hyphens, such as ‘python’, ‘test’, ‘10x’ or ‘python-webob’.

To limit the match to the actual package name, we need to search for what comes after the package name, and that’s a hyphen followed by the first digit of the version number: ‘-\d’.

Now that our expression is complete, we can access the match using the group method

print(name.group(1))  # Result might be: python-warlock, zlib, etc.
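To see the backtracking in action, here is a minimal sketch that wraps the expression in a helper. The name extract_name and the sample strings are only illustrative assumptions, and the helper returns None instead of failing when nothing matches.

```python
import re

# The greedy character class first consumes everything up to the first '.',
# then backtracks until what follows is a hyphen and a digit.
NAME_RE = re.compile(r'(^[a-zA-Z0-9-]*)-\d')


def extract_name(rpm_string):
    """Return the package name, or None when the string does not match."""
    match = NAME_RE.search(rpm_string)
    return match.group(1) if match else None


for rpm_string in ('zlib-1.2.8-10.fc28.x86_64',
                   'python-warlock-1.2.0-1.fc28.noarch',
                   'python2-ipaddr-2.1.10-1.fc28.noarch'):
    print(extract_name(rpm_string))
# prints: zlib, python-warlock, python2-ipaddr (one per line)
```

Checking for None before calling group() is a small design choice that keeps one unexpected string from raising an AttributeError in the middle of a loop.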

Extract the version

Extracting the version depends on whether you want the long or short version. Assuming short, you would want to use the following expression

version = re.search(r'((\d+\.)+\d+)', rpm_string)

The first part is (\d+\.)+ which means one or more digits followed by a dot (e.g. 123., 4., 24., 9251924.), and this group is itself repeated one or more times (e.g. 1., 95.24.52.).

The next part closes the expression and is one or more digits: ‘\d+’.

The complete expression would match: 1.2.3, 6.5, 2.54.3451.5868.234.1

Once again, to access it use the group method

print(version.group(1))  # Result might be: 1.2.8, 2.1.10, etc.
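Here is a small sketch of the short-version extraction as a reusable helper. The name extract_version and the sample strings are my own assumptions for illustration. Note that the expression requires at least one dot, so a single-number version is not matched.

```python
import re

# One or more "digits followed by a dot" groups, closed by a final run of digits.
VERSION_RE = re.compile(r'((\d+\.)+\d+)')


def extract_version(rpm_string):
    """Return the first dotted number sequence, or None when there is none."""
    match = VERSION_RE.search(rpm_string)
    return match.group(1) if match else None


print(extract_version('zlib-1.2.8-10.fc28.x86_64'))            # 1.2.8
print(extract_version('python2-ipaddr-2.1.10-1.fc28.noarch'))  # 2.1.10
print(extract_version('foo-5-1.fc28.noarch'))                  # None
```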

Extract architecture & operating system

I’m sure you already got the point from the previous examples, so we’ll make it shorter.

For the OS (operating system) use this expression

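os = re.search(r'\.([a-z]+[0-9]{1,2})', rpm_string)

Note: your OS may not include the major version number. In such a case, simply change {1,2} to {0,2}.

For the arch (architecture) use this expression. The character class includes digits and the underscore so that values such as x86_64 and i686 match as well as noarch

arch = re.search(r'\.([a-zA-Z0-9_]+)$', rpm_string)

Once again, use group(1) to access the match. More about the re module can be found in the Python documentation.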
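As a quick check of both expressions, here is a minimal sketch. The helper name extract_os_and_arch and the sample strings are hypothetical. The arch pattern in this sketch allows digits and underscores, which is my assumption about what is needed for values like x86_64.

```python
import re

# OS / dist tag: a dot, lowercase letters, then one or two digits (e.g. fc28, el7).
OS_RE = re.compile(r'\.([a-z]+[0-9]{1,2})')
# Arch: the last dot-separated field at the end of the string.
ARCH_RE = re.compile(r'\.([a-zA-Z0-9_]+)$')


def extract_os_and_arch(rpm_string):
    """Return an (os, arch) tuple; each item is None when it is not found."""
    os_match = OS_RE.search(rpm_string)
    arch_match = ARCH_RE.search(rpm_string)
    return (os_match.group(1) if os_match else None,
            arch_match.group(1) if arch_match else None)


print(extract_os_and_arch('zlib-1.2.8-10.fc28.x86_64'))           # ('fc28', 'x86_64')
print(extract_os_and_arch('python-warlock-1.2.0-1.fc28.noarch'))  # ('fc28', 'noarch')
print(extract_os_and_arch('foo-1.0-1.noarch'))                    # (None, 'noarch')
```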

Compare RPM versions

Types of version

For comparing RPM versions, we’ll use distutils.version, which delivers exactly what we need.

It has two types of version comparison:

  1. Strict – two or three numeric components separated by dots. The last part can be followed by ‘a’ or ‘b’ and a number (a.k.a. ‘pre-release’ tag). Examples: 1.2, 3.2.3, 8.3b2, 5.3.3a5
  2. Loose – dot or letter separation. Numeric parts are compared numerically and alphabetic parts lexically. Examples: 2g5, 12, 5.1h3, 9.45ba.33

For RPM version comparison, we’ll use the strict version, as the short versions we extracted earlier consist only of dot-separated numbers.
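To see why a version-aware comparison is needed at all, here is a small sketch that compares the raw strings against a numeric comparison. It deliberately avoids distutils (which was removed from the standard library in Python 3.12) and uses a hypothetical helper, version_key, as a stand-in, so treat it as an illustration rather than a replacement for StrictVersion.

```python
def version_key(version):
    """Turn '1.10.2' into (1, 10, 2) so tuples compare number by number."""
    return tuple(int(part) for part in version.split('.'))


# Plain strings compare character by character, so '1.10' sorts before '1.9'.
print('1.10' > '1.9')                            # False
# Tuples of integers compare component by component, as versions should.
print(version_key('1.10') > version_key('1.9'))  # True
```

One difference worth knowing: StrictVersion treats ‘1.2’ and ‘1.2.0’ as equal, while this simple helper considers (1, 2) lower than (1, 2, 0).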

Compare versions

Remember we extracted the RPM version earlier using the re module? Good. We’ll use it now for the comparison.

from distutils.version import StrictVersion

def is_ge(rpm_ver1, rpm_ver2):
    return StrictVersion(rpm_ver1) >= StrictVersion(rpm_ver2)

is_ge('0.4', '1.2') # False
is_ge('2.2.4', '1.7') # True

This is pretty basic. We created a function to compare two versions, specifically checking if the first version is greater than or equal to the second given version.

One (quite big) issue with the current implementation: it’s not dynamic enough. It’s a very specific comparison (‘>=’). What about scenarios in which I get the comparison type from the user? Also, what if I get it in the form of ‘>’, ‘<’, ‘==’, ‘>=’, etc.?

We’ll see how to deal with that in the following section.

Comparing version to python requirement specification

You might get a dependency specification such as ‘>=1.3, !=2’, similar to what Python requirements files look like.
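Before comparing anything, such a specification has to be split into operator and version pairs. Here is a minimal sketch of one way to do it; parse_spec and SPEC_RE are hypothetical names, and the accepted operators are simply my assumption about what such a string may contain.

```python
import re

# Two-character operators are listed first so that '>=' is not read as '>'.
SPEC_RE = re.compile(r'^\s*(==|!=|<=|>=|<>|<|>|=)\s*(\S+)\s*$')


def parse_spec(spec):
    """Split '>=1.3, !=2' into [('>=', '1.3'), ('!=', '2')]."""
    pairs = []
    for part in spec.split(','):
        match = SPEC_RE.match(part)
        if match is None:
            raise ValueError('cannot parse constraint: %r' % part)
        pairs.append((match.group(1), match.group(2)))
    return pairs


print(parse_spec('>=1.3, !=2'))  # [('>=', '1.3'), ('!=', '2')]
```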

# An example for requirements.txt file


In such a case you may want to create an operators map variable to hold all the literal comparison operators, so you can convert them to the names of the rich comparison methods (you’ll see why in a second)

op_map = {
    '==': 'eq', '=':  'eq', 'eq': 'eq',
    '<':  'lt', 'lt': 'lt',
    '<=': 'le', 'le': 'le',
    '>':  'gt', 'gt': 'gt',
    '>=': 'ge', 'ge': 'ge',
    '!=': 'ne', '<>': 'ne', 'ne': 'ne'
}

You can then use it to check if an RPM version (e.g. 1.2.3) meets a version specification (e.g. ‘>=2.0’; StrictVersion needs at least a major and a minor number)

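import operator

# check_version is a hypothetical wrapper name; chosen_op is e.g. '>='
def check_version(version1, chosen_op, version2):
    op = op_map[chosen_op]
    cmp_method = getattr(operator, op)

    return cmp_method(StrictVersion(version1), StrictVersion(version2))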

As a first step, we imported the operator module, which is part of the Python standard library and exposes the built-in operators, including the comparison ones, as functions. I recommend reading more about it in the Python documentation.

We then extracted from our op_map variable the name of the comparison method according to the provided/chosen operator (chosen_op). So op might be ‘gt’ if ‘>’ was chosen.

Next we extract the comparison method from the operator module, using the ‘op’ variable. If op is ‘gt’ then ‘cmp_method’ is the gt comparison function of the operator module.

The final step is to compare, using ‘cmp_method’, which is now one of the comparison functions of the operator module.
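Putting the steps above together, here is a minimal sketch that checks one version against several constraints at once. The names satisfies and version_key are hypothetical, and the sketch compares zero-padded integer tuples instead of StrictVersion objects (an assumption made so it also runs on Python 3.12+, where distutils is no longer available, and so that ‘2’ and ‘2.0’ compare as equal).

```python
import operator

op_map = {
    '==': 'eq', '=':  'eq', 'eq': 'eq',
    '<':  'lt', 'lt': 'lt',
    '<=': 'le', 'le': 'le',
    '>':  'gt', 'gt': 'gt',
    '>=': 'ge', 'ge': 'ge',
    '!=': 'ne', '<>': 'ne', 'ne': 'ne'
}


def version_key(version, width=4):
    """Turn '1.3' into (1, 3, 0, 0); short versions are padded with zeros."""
    parts = [int(part) for part in version.split('.')]
    return tuple(parts + [0] * (width - len(parts)))


def satisfies(version, constraints):
    """True when version meets every (operator, version) constraint."""
    return all(
        getattr(operator, op_map[op])(version_key(version), version_key(wanted))
        for op, wanted in constraints
    )


constraints = [('>=', '1.3'), ('!=', '2')]
print(satisfies('1.5', constraints))  # True
print(satisfies('2.0', constraints))  # False, because 2.0 equals 2
print(satisfies('1.2', constraints))  # False, because 1.2 is lower than 1.3
```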

Compare additional attributes

Not much to say here, as any other attribute you might want to compare (name, arch, etc.) can be compared with a plain equality check such as ‘return name1 == name2’. No tricks needed, unlike version comparison.

The complete code

I know it can be annoying to assemble all the parts here by yourself, so I gathered all the lines scattered through this post into one continuous block.

from distutils.version import StrictVersion
import dnf
import operator
import re

# Assumptions:
#   - chosen_op -> provided by the user/external source

packages = {}

op_map = {
    '==': 'eq', '=':  'eq', 'eq': 'eq',
    '<':  'lt', 'lt': 'lt',
    '<=': 'le', 'le': 'le',
    '>':  'gt', 'gt': 'gt',
    '>=': 'ge', 'ge': 'ge',
    '!=': 'ne', '<>': 'ne', 'ne': 'ne'
}

# Setup
base = dnf.Base()
base.read_all_repos()
base.fill_sack()

# Query
sack_query = base.sack.query()
avail_rpms = sack_query.available()
zlib_rpms = avail_rpms.filter(name__substr='zlib')

for pkg in zlib_rpms:
    # str(pkg) is assumed to look like 'zlib-1.2.8-10.fc28.x86_64'
    rpm_string = str(pkg)
    name = re.search(r'(^[a-zA-Z0-9-]*)-\d', rpm_string).group(1)
    version = re.search(r'((\d+\.)+\d+)', rpm_string).group(1)
    os = re.search(r'\.([a-z]+[0-9]{1,2})', rpm_string).group(1)
    arch = re.search(r'\.([a-zA-Z0-9_]+)$', rpm_string).group(1)

    packages[name] = {'version': version}
    packages[name]['os'] = os
    packages[name]['arch'] = arch

op = op_map[chosen_op]
cmp_method = getattr(operator, op)

# 'qlib' is a placeholder; use any other name that exists in packages
version_zlib = packages['zlib']['version']
version_qlib = packages['qlib']['version']

print(cmp_method(StrictVersion(version_zlib), StrictVersion(version_qlib)))
