Linux USA 05.2024
10 Applaudable Apps for Your Linux Toolkit
Build your own mail client
Barcodes and QR codes
WWW.LINUX-MAGAZINE.COM
EDITORIAL
Welcome
Info
[1] “Laid-Off Techies Face ‘Sense of Impending Doom’ with Job Cuts at the Highest Since Dot-Com Crash” by Alex Koller, CNBC, March 15, 2024: https://www.cnbc.com/2024/03/15/laid-off-techies-struggle-to-find-jobs-with-cuts-at-highest-since-2001.html
[2] “Vermont Senator Bernie Sanders Introduces Four-Day Workweek Bill” by Martin Pengelly, The Guardian, March 15, 2024: https://www.theguardian.com/us-news/2024/mar/15/bernie-sanders-four-day-workweek
[3] “Mercedes Is Trialing Humanoid Robots for Low-Skill Repetitive Tasks” by Jess Weatherbed, The Verge, March 15, 2024: https://www.theverge.com/2024/3/15/24101791/mercedes-robot-humanoid-apptronik-apollo-manufacturing
ON THE COVER
32 Attacking SSH
Learn about some tools intruders use to attack SSH servers – and what you can do to protect your system.
38 Pivot Tables
Sort, group, and process spreadsheet data for easy viewing.
48 Bash PaLM Shell
Access the Google Pathways AI system from inside a terminal window.
52 Go Mail Client
The best way to learn about IMAP is to build your own email client.
64 Jellyfin
Stream videos without a cloud connection.
90 Zint
You are probably already imagining reasons why you’d want to create your own QR codes.
NEWS
8 News
• Zorin OS 17.1 Released, Includes Improved Windows App Support
• Linux Market Share Surpasses 4% for the First Time
• KDE’s Plasma 6 Officially Available
• Latest Version of Tails Unleashed
• KDE Announces New Slimbook V with Plenty of Power
• Monthly Sponsorship Includes Early Access to elementary OS 8
• DebConf24 to Be Held in South Korea
12 Kernel News
• Improving Web Browser Security

IN-DEPTH
26 Data Management
Open source database management systems offer greater flexibility and lower costs while avoiding vendor lock-in. Finding the right one depends on your project’s needs.
32 Attacking SSH
Sometimes the only way to break into an SSH server is through brute force – and yes, there are tools for that.
38 LibreOffice Calc Pivot Tables
Pivot tables let you sort, rearrange, group, and perform calculations on your spreadsheet data. We help you get started with this powerful tool.
78 pdfsandwich
Use this handy tool to make your scanned PDF files zoomable and searchable.

MakerSpace
dark mode, Gnome’s Night Light, and everything else that came along with that
particular Gnome release.
You also will find that Tails 6.0 includes a new auto-mount feature that supports
encrypted devices and protection against malicious USB devices.
As far as applications are concerned, Tails includes the likes of KeePassXC 2.7.4,
Inkscape 1.2.2, Audacity 3.2.4, Gimp 2.10.34, Tor Browser 13.0.10, Metadata
Cleaner 2.4.0, and even a release of Thunderbird that supports Gmail accounts as
well as the Gnome Text Editor.
To find out more about Tails 6.0, read the official release notes (https://tails.net/news/version_6.0/index.en.html) and find out how to install Tails from the official Install Tails page (https://tails.net/install/index.en.html).
NEWS
Zack’s Kernel News
Chronicler Zack Brown reports on the latest news, views, dilemmas, and developments within the Linux kernel community.
By Zack Brown

Author
The Linux kernel mailing list comprises the core of Linux development activities. Traffic volumes are immense, often reaching 10,000 messages in a week, and keeping up to date with the entire scope of development is a virtually impossible task for one person. One of the few brave souls to take on this task is Zack Brown.

Improving Web Browser Security
Jeff Xu submitted a security patch on behalf of the Chrome browser project. In general, browsers are among the most security-intensive software projects out there. Browsing the web involves directly running a whole lot of code from all over the Internet locally on your computer. Bad actors abound. Protecting users is one of the key essential features of any web browser.

In this case, Jeff proposed a new mseal() system call, which would allow Chrome to “seal” regions of memory against modification. Jeff’s patch also included some changes to the generic mmap() system call to add the PROT_SEAL and MAP_SEALABLE bits to the mmap() flags. These on/off bits would tell whether a region had been sealed or was available to be sealed.

As Jeff explained it, “Memory sealing is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped. Memory sealing can automatically be applied by the runtime loader to seal .text and rodata [read-only data] pages and applications can additionally seal security critical data at runtime.”

Jeff also mentioned (which he may have later regretted) that OpenBSD already had a similar feature in its mimmutable() system call.

Jeff’s mseal() patch’s specific value for Chrome was to support its Control Flow Integrity (CFI) features. CFI refers to the general goal of preventing malware from redirecting code execution from the legitimate program to the malware code itself. The mseal() patch was intended to be part of Chrome’s larger scale CFI efforts.

Jeff gave a more in-depth explanation, saying:

“The Chrome browser has very specific requirements for sealing, which are distinct from those of most applications. For example, in the case of libc, sealing is only applied to read-only (RO) or read-execute (RX) memory segments (such as .text and .RELRO) to prevent them from becoming writable, the lifetime of those mappings are tied to the lifetime of the process.

“Chrome wants to seal two large address space reservations that are managed by different allocators. The memory is mapped RW- and RWX respectively but write access to it is restricted using pkeys […]. The lifetime of those mappings are not tied to the lifetime of the process, therefore, while the memory is sealed, the allocators still need to free or discard the unused memory. For example, with madvise(DONTNEED).

“However, always allowing madvise(DONTNEED) on this range poses a security risk. For example if a jump instruction crosses a page boundary and the second page gets discarded, it will overwrite the target bytes with zeros and change the control flow. Checking write-permission before the discard operation allows us to control when the operation is valid. In this case, the madvise will only succeed if the executing thread has PKEY write permissions and PKRU changes are protected in software by control-flow integrity.

“Although the initial version of this patch series is targeting the Chrome browser as its first user, it became
evident during upstream discussions that we would also want to ensure that the patch set eventually is a complete solution for memory sealing and compatible with other use cases.”

Liam R. Howlett had some concerns about Jeff’s last point regarding turning this patch into a generally useful feature for everyone. He remarked, “The best plan to ensure it is a general safety measure for all of linux is to work with the community before it lands upstream. It’s much harder to change functionality provided to users after it is upstream.” And he added, “I am very concerned this feature will land and have to be maintained by the core mm people for the one user it was specifically targeting.”

Liam also asked to see some benchmarks to see how much this feature might slow down the system as a whole. Jeff attempted to address some of Liam’s concerns (more on this coming up) and added, “I would love to hear more from Linux developers on this.”

The irascible Theo de Raadt from the OpenBSD project also had some comments, which he prefaced by saying to Jeff, “I’m not sure you are capable of listening. But I’ll repeat for others to stop this train wreck.”

Theo pointed out that when the Linux kernel used the execve() syscall to run the Chrome browser or any other program, it did not mark the various data-centric regions of memory as “sealable,” and for good reason. Other security features, such as Relocation Read-Only (RELRO) and Indirect Functions (IFUNC) wouldn’t work in that case. These features are used to identify function addresses and specific function implementations at runtime, when the code has enough information about the system to choose the best implementation of a given function. Sealing up that memory before those identifications could take place would make those security measures unavailable.

As Theo put it, “You think you can seal the stack in the kernel?? Sorry to be the bearer of bad news, but glibc has code which on occasion will mprotects the stack executable. But if userland decides that mprotect case won’t occur – how does a userland program seal its stack? It is now too late to set up MAP_SEALABLE. So the stack must remain unsealed.”

He went on to say, “The only answer to these problems will be to always set MAP_SEALABLE. To go through the entire Linux ecosystem, and change every call to mmap() to use this new MAP_SEALABLE flag […]. Every single one of them, and you’ll need to do it in the kernel.”

Theo concluded, “If you had spent a second trying to make this work in a second piece of software, you would have realized that the ONLY way this could work is by adding a flag with the opposite meaning: MAP_NOTSEALABLE. But nothing will use that. I promise you.”

Liam also replied, reiterating some of his concerns that Jeff’s code might end up having to be maintained by kernel people for the sole use of Chrome, rather than as a general-purpose security measure. He remarked that Jeff seemed to be rushing forward with this patch, removing the Request For Comments (RFC) notation, and in fact trying to implement more than Linus Torvalds or other big-time kernel hackers had suggested in the past. As he put it, “I think there are some unanswered questions and that’s frustrating some people as you may not be valuing the experience they have in this area.”

He offered a lot of specific technical suggestions and questions for Jeff to consider.

Theo posted another reply, offering his own pile of technical suggestions. Theo also pointed out that as someone primarily in the OpenBSD world, Linux development in this regard seemed strange. As he put it, “Two sub-features are being pushed very hard, and the primary developer doesn’t have code which uses either of them. And once it goes in, it cannot be changed. It’s very different from my world, where the absolutely minimal interface was written to apply to a whole operating system plus 10,000+ applications, and then took months of testing before it was approved for inclusion. And if it was subtly wrong, we would be able to change it.”

Meanwhile Jeff replied to all of this, clarifying the chain of events that led to his patch and its specific features. The key discussions ran from December 12, 2023 to January 9, 2024. Early in the discussion, Linus had made some recommendations, saying (as quoted by Jeff), “Particularly for new system calls with fairly specialized use, I think it’s very important that the semantics are sensible on a conceptual level, and that we do not add system calls that are based on ‘random implementation issue of the day’.”

Jeff said he sent out a patch based on all of Linus’s earlier recommendations. He had kept MAP_SEALABLE because no one had commented on it as of his previous patch submissions.

On January 4, Jeff said, Linus had replied to that day’s version of Jeff’s patch, saying, “Other than that, this seems all reasonable to me now.” Which Jeff interpreted to mean (as Jeff put it), “Linus is OK with the general signatures of the APIs.”

Then on January 9, Kees Cook had been the one to suggest dropping the “RFC” notation, “given Linus’s general approval.”

In other words, Jeff said, he had not been rushing forward as Liam feared, but had only been moving at a pace suggested or at least implied by Linus and others. He added, “That said, I’m open to feedback from Linux developers.”

However, to the notion that Linus had approved the APIs, Theo retorted, “Linus, you are in for a shock when the proposal doesn’t work for glibc and all the applications!”

At this point Linus came into the conversation, ribbing Theo for his acerbic style.

Linus said:

“Heh. I’ve enjoyed seeing your argumentative style that made you so famous back in the days. Maybe it’s always been there, but I haven’t seen the BSD people in so long that I’d forgotten all about it.

“That said, famously argumentative or not, I think Theo is right, and I do think the MAP_SEALABLE bit is nonsensical.

“If somebody wants to mseal() a memory region, why would they need to express that ahead of time?

“So the part I think is sane is the mseal() system call itself, in that it allows *potential* future expansion of the semantics.
“But hopefully said future expansion isn’t even needed, and all users want the base experience, which is why I think PROT_SEAL (both to mmap and to mprotect) makes sense as an alternative form.

“So yes, to my mind

mprotect(addr, len, PROT_READ);
mseal(addr, len, 0);

“should basically give identical results to

mprotect(addr, len, PROT_READ | PROT_SEAL);

“and using PROT_SEAL at mmap() time is similarly the same obvious notion of ‘map this, and then seal that mapping’.

“The reason for having ‘mseal()’ as a separate call at all from the PROT_SEAL bit is that it does allow possible future expansion (while PROT_SEAL is just a single bit, and it won’t change semantics) but also so that you can do whatever prep-work in stages if you want to, and then just go ‘now we seal it all’.”

A big flurry of technical suggestions and debate followed this direction from Linus. Theo was unsatisfied and pointed out various problems that he felt would be involved in implementing what Linus had suggested.

While all the technical back-and-forth was going on, Jeff also pointed out:

“I like to look at things from the point of view of average Linux userspace developers, they might not have the same level of expertise as the other folks on this email list or they might not have time and mileage for those details.

“To me, the most important thing is to deliver a feature that’s easy to use and works well. I don’t want users to mess things up, so if I’m the one giving them the tools, I’m going to make sure they have all the information they need and that there are safeguards in place.”

He went on to point out some potential use cases that might involve the application developer in a lot of unexpected complications.

He added, on another practical note, “If the decision of MAP_SEALABLE depends on the glibc case being able to use it, we can develop such a patch, but it will take a while, say a few weeks to months, due to vacation, work load, etc.”

Eventually Linus returned with a bit more critique and direction, saying:

“So the problem with this whole patch series from the very beginning was that it was very specialized, and COMPLETELY OVER-ENGINEERED.

“It got simpler at one point. And then you started adding these features that have absolutely no reason for them. Again.

“It’s frustrating. And it’s not making it more likely to be ever merged.”

Jeff replied, “I’m sorry for overthinking. Remove the MAP_SEALABLE it is then.”

Greg Kroah-Hartman remarked at a certain point, “There’s no rush here, and no deadlines in kernel development. If you don’t have a working userspace user for your new feature(s), there is no way we can accept the changes to the kernel (and hint, you don’t want us to either…).”

The technical discussion continued with vim and verve, focused on implementing mseal() alone, at this point. An interesting interaction occurred when Liam mentioned, “What I’m more concerned about is what happens if you call mseal() on a range and it can mseal a portion. Like, what happens to the first vma in your test_seal_unmapped_middle case? I see it returns an error, but is the first VMA mseal()’ed? (no it’s not, but test that).”

To which Theo replied, “That is correct, Liam. Unix system calls must be atomic.”

To that, however, Linus returned to the conversation, saying:

“That’s actually not true, and never has been.

“It’s a good thing to aim for, but several errors means ‘some or all may have been done’.

“EFAULT (for various system calls), ENOMEM and other errors are all things that can happen after some of the system call has already been done, and the rest failed.

“There are lots of examples, but to pick one obvious VM example, something like mlock() may well return an error after the area has been successfully locked, but then the population of said pages failed for some reason.

“Of course, implementations can differ, and POSIX sometimes has insane language that is actively incorrect.

“Furthermore, the definition of ‘atomic’ is unclear. For example, POSIX claims that a ‘write()’ system call is one atomic thing for regular files, and some people think that means that you see all or nothing. That’s simply not true, and you’ll see the write progress in various indirect ways (look at intermediate file size with ‘stat’, look at intermediate contents with ‘mmap’ etc etc).

“So I agree that atomicity is something that people should always *strive* for, but it’s not some kind of final truth or absolute requirement.

“In the specific case of mseal(), I suspect there are very few reasons ever *not* to be atomic, so in this particular context atomicity is likely always something that should be guaranteed. But I just wanted to point out that it’s most definitely not a black-and-white issue in the general case.”

To which Jeff replied, “Thanks. At least I got this part done right for mseal() :-).”

The discussion ended after some further technical delving and debate, with no absolute consensus on one patch or the other. But it does seem that Linus gave a lot of clarity to what he wanted and what he would accept. And the fact that he’d comment at all is an indication that he’s generally in favor of the overall feature, if it could be implemented well and if various other concerns, such as identifying actual users beyond Chrome alone, could be addressed.

It’s not at all uncommon to see a big organization like Google push for a particular kernel feature they want for a single piece of software. Database companies like Oracle, for example, have definitely been known to try to get features into the kernel that would speed up or otherwise benefit their product. Linus’s attitude in such cases seems to be that if there is really a value to kernel users in general, then such a feature might be welcome. If there’s no general value, and especially if there’s a speed penalty for the system as a whole, that’s another story.

In the case of Jeff’s security patches, it does seem as though Linus and other big time kernel people have been involved roughly since the beginning, and the only serious opposition to any of Jeff’s proposed features has been on technical implementation grounds, rather than on the actual benefits they offer.
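Liam’s question about partially sealed ranges can be made concrete with a toy model. The Python sketch below is not kernel code: the Mapping class and the seal_range function are hypothetical stand-ins, and it assumes the all-or-nothing behavior discussed above, where a request that touches an unmapped hole fails without sealing anything.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    """Toy stand-in for a kernel VMA: a half-open range [start, end)."""
    start: int
    end: int
    sealed: bool = False


def seal_range(mappings, start, end):
    """Seal every mapping overlapping [start, end), or none of them.

    Returns True on success. Returns False, changing nothing, if any
    part of the range is not covered by a mapping. (A real kernel would
    also split mappings at the range boundaries; this toy model seals
    whole mappings to stay short.)
    """
    covered = sorted(
        (m for m in mappings if m.start < end and m.end > start),
        key=lambda m: m.start,
    )
    # Pass 1: validate only. Walk the range and look for gaps.
    cursor = start
    for m in covered:
        if m.start > cursor:
            return False  # unmapped hole before this mapping
        cursor = max(cursor, m.end)
    if cursor < end:
        return False  # range runs past the last mapping
    # Pass 2: apply. Nothing can fail here, so the call is all-or-nothing.
    for m in covered:
        m.sealed = True
    return True


# Two mappings with an unmapped page between them.
vmas = [Mapping(0x1000, 0x2000), Mapping(0x3000, 0x4000)]
ok = seal_range(vmas, 0x1000, 0x4000)
print(ok, [m.sealed for m in vmas])
```

Splitting the work into a validation pass and an apply pass is one common way to get the atomic behavior; as Linus notes, many system calls do not work this way and can fail after doing part of the job.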
By Michael Williams

Programs you run on your Linux desktop often have to communicate with each other. For example, when you open a text file by clicking on it in your file manager, your file manager launches your preferred text editor. However, if you already have an instance of your text editor open, the text editor program as launched by your file manager most likely will request the open instance to open the file in a new tab; then the instance invoked by the file manager will exit. As another example, many laptop computer keyboards have a set of extra keys on or next to the main keyboard for controlling multimedia playback. When pressed, these keys activate a daemon that is part of your desktop environment, and this daemon in turn passes the message on to the most recently opened media player application.

Unix-like operating systems have a number of ways for processes to communicate with each other. Aside from simply passing information in regular files, the most familiar form of inter-process communication (IPC) is the pipe. Pipes are relatively easy to use, but they are only unidirectional, which means a program can send or receive data but cannot do one […]

[…] tiring details of sockets. This article will introduce you to D-Bus and explore some practical examples showing D-Bus at work in the Linux environment.

D-Bus is now so widely used by so many programs that it is virtually guaranteed to already be installed on your Linux system. Even the init daemon on most Linux distributions, systemd [3], uses D-Bus to facilitate communication between its components and higher-level application programs. As such, I will not bother covering the topic of how to install D-Bus on your system. Some of the examples I give later in this article are written in the Python [4] programming language and require the dbus Python module [5]. Some of the later, more complex examples I give also require the GLib [6] library, even though I tried to reduce the number of library dependencies you would need. These Python modules are not guaranteed to be pre-installed on your installation. On Debian, Ubuntu, and their derivatives, you can ensure that you have them installed using the command

sudo apt install python3-dbus python3-gi gir1.2-glib-2.0
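Before running the later examples, it can help to confirm that the Python modules are importable. The following is a minimal sketch, assuming the packages above provide the top-level modules dbus and gi; the helper name missing_modules is made up for this illustration.

```python
import importlib.util


def missing_modules(names=("dbus", "gi")):
    """Return the top-level module names that Python cannot locate."""
    return [n for n in names if importlib.util.find_spec(n) is None]


missing = missing_modules()
if missing:
    print("Missing Python modules:", ", ".join(missing))
else:
    print("dbus and gi are both available")
```

The check only looks the modules up without importing them, so it is safe to run even on a machine with no message bus.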
MPRIS
The Media Player Remote Interfacing Specification (MPRIS) is a protocol that lets one program control the playback of audio or video in another program. There are commands to play, pause, stop, and seek the playback, as well as commands to switch to other “tracks” or multimedia files.

The primary, original purpose of MPRIS was for the implementation of the special media keys found to the side of many laptop computer keyboards (or sometimes integrated into the main keyboard itself). These keys were originally designed to control audio CD playback without first having to navigate to the window of the CD-playing application. Even though playing audio CDs is far less popular than it used to be, music playing applications in general, as well as other multimedia applications, remain just as popular as ever. Having to focus the window for any multimedia application before controlling it can be inconvenient, and the multimedia keys have thus remained on laptop keyboards.

Typically, controlling media playback with the media keys is a two-step process. Some daemon that is part of your desktop environment intercepts, or grabs, each of the media keys, so that when a media key is pressed, the daemon is notified (rather than the default behavior, which is that the currently focused application receives the keypresses). Then, the daemon requests the full list of registered names from the D-Bus daemon and scans the list for MPRIS-compliant media players; when a media player implements MPRIS, it registers on the bus with a name of the form org.mpris.MediaPlayer2 followed by the name of the media player. Optionally, or if one or more instances of the media player are already running, a unique instance identifier (usually the media player’s process ID) is appended to the name as well.

Once the daemon compiles a list of the bus names of the running media players, the daemon must choose the media player to which it will send the appropriate command (play/pause, seek, etc.). Usually, the daemon chooses the media player that was most recently opened, though in theory it could send the message to a different media player, a random one in the list, or even all of them.

There is no requirement that every media player application implement all MPRIS commands, but all of the commands I will use in this article are basic enough that they should be fairly widely supported no matter your media player of choice.
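The daemon’s final step, choosing where to send the command, can be sketched in a few lines of Python. Everything here is hypothetical: the Player record, its opened_at timestamp, the policy names, and the playerA/playerB bus names are assumptions made for illustration, not part of MPRIS or of any particular desktop environment.

```python
import random
from dataclasses import dataclass


@dataclass
class Player:
    bus_name: str     # name the player registered on the bus
    opened_at: float  # hypothetical timestamp recorded by the daemon


def choose_targets(players, policy="most_recent"):
    """Return the list of players that should receive the command."""
    if not players:
        return []
    if policy == "most_recent":
        return [max(players, key=lambda p: p.opened_at)]
    if policy == "random":
        return [random.choice(players)]
    if policy == "all":
        return list(players)
    raise ValueError(f"unknown policy: {policy}")


players = [
    Player("org.mpris.MediaPlayer2.playerA", opened_at=100.0),
    Player("org.mpris.MediaPlayer2.playerB", opened_at=250.0),
]
for target in choose_targets(players):
    print(target.bus_name)
```

With the default policy, the sketch picks playerB because it has the later timestamp, which mirrors the usual “most recently opened” behavior described in the box.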
An important thing to note is that there can be more than one message bus on a running Linux system. In fact, there are usually at least two: There is almost always a system message bus, which is accessible by all processes on the system, no matter which user owns the process. The system message bus is usually used to communicate between user programs and system daemons. As an example, logind, a system daemon for managing user logins as well as allowing normal users to control the power state of the computer (reboot, shutdown, suspend mode, etc.), uses the system message bus to communicate with other programs.

In addition to the system message bus, usually another, private message bus is created for each graphical session associated with a user login. Each one of these message busses is known as a session bus. Each session bus is accessible by only the user who “owns” the bus, i.e., the user who started the session bus in the first place. Two programs running in two different desktop sessions cannot communicate with each other over D-Bus unless they use the system message bus.

Most of the examples in this article will be concerned with the session bus; the final example will use the system bus.

A Practical Example: Media Players
I am going to illustrate the basic concepts of D-Bus with a practical example: controlling media-playing applications. Many major audio, video, and music playing applications on Linux support an interface for remote-controlling the multimedia playback using a protocol known as the Media Player Remote Interfacing Specification (MPRIS) [8]. Any media player application that supports the MPRIS protocol can be controlled by another program using simple commands sent via D-Bus. Even some programs that are not dedicated media players but are sometimes used to play multimedia implement MPRIS. The Firefox and Chromium web browsers, as well as other web browsers, implement MPRIS because it is common for web pages to include embedded videos and audio. (See the box entitled “MPRIS” for more about the MPRIS protocol.)

As a first example, let me list all of the media players I am currently running. An MPRIS-compliant media player identifies itself on the message bus with a unique name that always starts with org.mpris.MediaPlayer2. All one needs to do to discover all the running media players that implement MPRIS is to query the D-Bus daemon for a list of all names registered on the bus, then search the list for names starting with org.mpris.MediaPlayer2. In Figure 1, I have launched a total of five instances of four different media player programs. Listing 1 displays the names of each program and instance registered on the bus. Each media player’s window is marked in the figure with a number corresponding to the respective line number in the listing. Notice how the second instance of VLC distinguishes itself from the first instance with the process ID appended to the end of its D-Bus name.

Listing 1: Finding D-Bus Names
$ dbus-send --print-reply --dest=org.freedesktop.DBus / org.freedesktop.DBus.ListNames \
| grep -Eo 'org.mpris.MediaPlayer2.[^"]+' | nl -ba -s\: -w1
1:org.mpris.MediaPlayer2.vlc
2:org.mpris.MediaPlayer2.qmmp
3:org.mpris.MediaPlayer2.firefox.instance3794
4:org.mpris.MediaPlayer2.io.github.celluloid_player.Celluloid.instance-1
5:org.mpris.MediaPlayer2.vlc.instance3689

In Listing 1, I have used the dbus-send command to invoke a standard method implemented by the D-Bus daemon itself (which holds the name org.freedesktop.DBus): the ListNames method. The dbus-send command is a useful tool shipped by default along with D-Bus itself. It is mainly intended for debugging, though for the first few examples here I will be using it as a simple way to demonstrate the basic concepts of D-Bus.

Note that I am running two instances of VLC at the same time. Each instance has registered its own name on the bus. The first instance of VLC (number 1 in Listing 1) identifies itself simply as vlc; the second instance (number 5) appended its process ID onto the end of its D-Bus name to avoid clashing with the other instance of VLC. Apparently VLC prefers to register a simple, generic name for itself if possible: The second instance first tried to register itself as org.mpris.MediaPlayer2.vlc, but the D-Bus daemon replied that the name had already been acquired by another program. The second instance of VLC then tried again, this time with a more unique name, and apparently succeeded in registering that name. Other applications, such as Firefox (line 3), register a unique name from the start, even if no other instance of that application is or was running.
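The filtering step that grep performs in Listing 1 is easy to express in Python as well. The sketch below works on a plain list of strings, so it runs without a message bus; the sample list reuses the names shown in Listing 1 plus two non-MPRIS entries added for contrast. The function names and the rule for spotting an instance suffix (a final component beginning with “instance”) are assumptions drawn from those sample names, not something MPRIS guarantees.

```python
MPRIS_PREFIX = "org.mpris.MediaPlayer2."


def mpris_names(bus_names):
    """Keep only the bus names that start with the MPRIS prefix."""
    return [n for n in bus_names if n.startswith(MPRIS_PREFIX)]


def split_instance(name):
    """Split an MPRIS bus name into (player, instance or None).

    Heuristic: the last dot-separated component counts as an instance
    identifier only if it begins with "instance".
    """
    rest = name[len(MPRIS_PREFIX):]
    player, sep, last = rest.rpartition(".")
    if sep and last.startswith("instance"):
        return player, last
    return rest, None


bus_names = [
    "org.freedesktop.DBus",
    ":1.42",
    "org.mpris.MediaPlayer2.vlc",
    "org.mpris.MediaPlayer2.qmmp",
    "org.mpris.MediaPlayer2.firefox.instance3794",
    "org.mpris.MediaPlayer2.io.github.celluloid_player.Celluloid.instance-1",
    "org.mpris.MediaPlayer2.vlc.instance3689",
]

for number, name in enumerate(mpris_names(bus_names), start=1):
    player, instance = split_instance(name)
    print(number, player, instance)
```

On a live system, the input list would come from the ListNames method instead of being hard-coded.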
Figure 1: Five instances of four distinct media player programs: (1) VLC [9], (2) the QMMP [10] music player, (3) the Firefox web browser, (4) Celluloid [11], and (5) a second instance of VLC.

I encourage you to try the command in Listing 1 yourself while you have one or more media players of your own
running. If, however, you are running one or more media players and yet the command in Listing 1 produces no output, that does not necessarily mean that your running programs don’t support MPRIS. Some programs, such as the Chromium web browser, do not register an MPRIS name on the message bus unless they are actively playing audio or video. You might have to load up one or more videos before the command will display anything, whereas many programs will register an MPRIS name as soon as you open the program. Other programs, such as QMMP (and sometimes even VLC), require that you explicitly enable a plug-in or add-on module for MPRIS support. If you have to enable such a plug-in, you might also need to restart the application before it registers on the message bus.

Controlling Media Players with D-Bus Methods
[…] parameters but returned an array of strings, each element of the array containing a name registered on the bus.

When a program wants to make available one or more methods for other D-Bus clients to invoke, it registers the interface – complete with a listing of all the available methods, parameters, and return values – with the D-Bus daemon. Often, a program will store this interface description in a separate data file outside of the program’s code; in other cases, the interface is described in a format embedded in the code of the program itself.

Having mentioned input parameters to methods, I will show you another example, this time using a method that accepts a single parameter as its input. I am again going to command QMMP, but this time I’ll tell it to skip backwards 10 seconds, using the Seek method:

dbus-send --type=method_call --dest=[…]
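The MPRIS Seek method takes its offset in microseconds as a signed 64-bit integer, so a 10-second skip backwards corresponds to -10000000. As a hedged illustration, the Python sketch below assembles a dbus-send argument list in the same form that Listing 2 uses; the helper name seek_command is made up here, and the result is only printed, not executed.

```python
def seek_command(player, seconds):
    """Build a dbus-send argument list that seeks by the given seconds.

    A negative value seeks backwards. The object path and method name
    follow the form shown in Listing 2.
    """
    offset_us = int(seconds * 1_000_000)  # Seek expects microseconds
    return [
        "dbus-send",
        "--type=method_call",
        f"--dest=org.mpris.MediaPlayer2.{player}",
        "/org/mpris/MediaPlayer2",
        "org.mpris.MediaPlayer2.Player.Seek",
        f"int64:{offset_us}",
    ]


cmd = seek_command("qmmp", -10)
print(" ".join(cmd))
```

On a desktop where QMMP is running with MPRIS enabled, passing the list to subprocess.run(cmd, check=True) should send the call over the session bus.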
Listing 2: Seeking Backwards
while :; do
  dbus-send --type=method_call --dest=org.mpris.MediaPlayer2.qmmp /org/mpris/MediaPlayer2 \
    org.mpris.MediaPlayer2.Player.Seek int64:-100000
  sleep .1
done

Python because its syntax is relatively clean and it is simple to use, even for those with no prior experience with it. I chose the dbus Python module because it is the official, canonical Python module for communicating via D-Bus. It is not the simplest interface to D-Bus available; other, easier-to-use interfaces exist, such as GDBus [12] (a part of the GLib library from the Gnome project) and Qt D-Bus [13] (a part of the Qt program libraries, best known for the Qt graphical user interface libraries). However, I have chosen not to use any of these abstractions in this article, and I instead chose to use the official D-Bus interface directly to avoid showing bias towards one high-level interface or another – and also to better illustrate the concepts behind D-Bus.
I will start with a very simple example, merely a version of the command in Listing 1 rewritten entirely in Python. As before, it looks for and prints the name of each and every running media player that implements MPRIS. In this Python version of the program, however, the ListNames D-Bus method returns a Python list of strings. This program iterates over each item in the list directly, rather than having to deal with the pre-formatted output of dbus-send, as in Listing 1. This example works equally well from the interactive Python shell as it does as a Python script.

Listing 3: D-Bus Names Python Script
#!/usr/bin/env python3

import dbus

bus = dbus.SessionBus ()
dbus_daemon = bus.get_object ("org.freedesktop.DBus", "/")
dbus_iface = dbus.Interface (dbus_daemon,
                             dbus_interface="org.freedesktop.DBus")

names = dbus_iface.ListNames ()
for name in names:
    if name.startswith ("org.mpris.MediaPlayer2."):
        print (name)

Obviously, such a program by itself is not of much use. For one thing, this program does nothing that the command in Listing 1 could not already do. For another, knowing the D-Bus names of each running media player is useless unless the names can then be used to communicate with one or more of the media players. Listing 4 provides a more useful example. Like the example in Listing 3, it enumerates every running MPRIS-compliant media player. But while the program in Listing 3 only prints the name of each media player, the example in Listing 4 instead sends a command to each media player, requesting that each media player immediately pause its playback. This example does so by invoking the MPRIS method Pause.
The example in Listing 4 might be useful if you have multiple media players actively playing media – for example, a music player in the background and a video playing in your web browser in the foreground – and you want to silence all of the media at the same time – for example, if you suddenly receive a phone call. Alternatively, if you are feeling creative (or you simply enjoy chaos), you can change the program in Listing 4 to call the Play method instead of Pause. This will cause all running media players to begin playback simultaneously. If you have open as many media players as I had open in
Figure 1, the result will (obviously) probably be a grossly incoherent response.
As a final simple example, the code in Listing 5 seeks all the open media backwards by 10 seconds. This example serves to demonstrate the passing of input parameters to D-Bus methods using the dbus Python module. However, unlike the very similar example shown earlier that used dbus-send, the dbus module is sophisticated enough to “introspect,” or determine the types of each of the method’s input parameters on its own. Notice that in Listing 5 I have not specified that the input parameter to the Seek method is a signed 64-bit integer, as I had to specify explicitly in the earlier example with dbus-send.

More on Signals
As I mentioned earlier in this article, D-Bus clients can not only receive messages from other programs using methods; they can also notify other programs of events that have occurred using signals. A program must be listening to a signal in order to be notified. To do this, a program must register a listener function with the D-Bus daemon. Once the listener function is registered, whenever the signal is “emitted,” the listener function will be called.
Unfortunately, signal handling is more complicated and a little messier than simply calling methods. For starters, the program receiving the signal cannot be written in the traditional programming model, as a series of steps to be executed one after another; signals can be sent at any time, and when they will be sent can be unpredictable. The traditional programming model is known as synchronous programming; in order to receive signals, the program must be asynchronous. Incidentally, asynchronous programming is also used heavily in programming graphical user interfaces, because user input is similarly unpredictable in nature. As such, if your program already implements a graphical user interface, listening to D-Bus signals will not further complicate your program by any significant amount. (See the box entitled “Asynchronous Programming” for more about asynchronous programming specifically.)
Listing 6 contains a simple Python program to listen for a signal and print a message in the terminal window when the signal is received. This example prints the message when a media player begins playback. It also prints the D-Bus name of the media player. Although the name is not useful in this example, because the name returned is not guaranteed to be a human-readable name such as org.mpris.MediaPlayer2.vlc, it will become useful in the next example.
To end the program, press Ctrl-C in the terminal window from which it is running. Figure 2 shows an example of this program in action. The first message in the terminal window was printed when I first launched the media player (in this case VLC). I subsequently paused and resumed the playback twice. Notice that there are no messages printed that show I paused the playback; I designed the example so that it would only print when playback commenced, and the program filters out and throws away all other events.
Listing 7 is even more interesting. Instead of printing when media playback commences, it immediately commands the media player to seek backwards five seconds. This program could be useful for a transcriptionist: If the transcriptionist has to pause the audio frequently to catch up with what is being said in the audio recording, it is often useful to rewind the audio by a few seconds to allow the transcriptionist time

Listing 6: Print with Playback
#!/usr/bin/env python3
[...]

Asynchronous Programming
[...] board to type into the current text document, the user could [...]
[...] for all three types of events, and the first one to occur will be [...]

[...]
    path="/org/mpris/MediaPlayer2",
    dbus_interface="org.freedesktop.DBus.Properties",
    signal_name="PropertiesChanged",
    handler_function=signal_received,
    sender_keyword="name")

[...]
loop.run ()

Listing 8 (fragment)
[...]
bus.add_signal_receiver (
    bus_name="org.freedesktop.login1",
    path="/org/freedesktop/login1",
    dbus_interface="org.freedesktop.login1.Manager",
    signal_name="PrepareForSleep",
    handler_function=signal_received)

loop = GLib.MainLoop ()
loop.run ()
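The listener pattern described under "More on Signals" can be sketched roughly as follows. This is not one of the article's listings. It is a minimal sketch that assumes the dbus Python module and PyGObject (for GLib) are installed, and the names seconds_to_offset, playback_started, and run are hypothetical helpers introduced here for illustration. It combines the two behaviors described in the text: report when a player begins playback, then seek back a few seconds.

```python
"""Rewind a media player a few seconds whenever it starts playing (sketch)."""

MPRIS_PATH = "/org/mpris/MediaPlayer2"
PLAYER_IFACE = "org.mpris.MediaPlayer2.Player"
PROPS_IFACE = "org.freedesktop.DBus.Properties"


def seconds_to_offset(seconds):
    """Convert seconds to the signed microsecond offset that MPRIS Seek expects."""
    return int(round(seconds * 1_000_000))


def playback_started(interface, changed):
    """Return True if a PropertiesChanged payload reports that playback began."""
    return (interface == PLAYER_IFACE
            and str(changed.get("PlaybackStatus", "")) == "Playing")


def run(rewind_seconds=5):
    """Listen on the session bus until interrupted with Ctrl-C."""
    # Imported here so the helpers above can be used without D-Bus installed.
    import dbus
    from dbus.mainloop.glib import DBusGMainLoop
    from gi.repository import GLib

    DBusGMainLoop(set_as_default=True)
    bus = dbus.SessionBus()

    def signal_received(interface, changed, invalidated, name=None):
        # "name" is the sender's bus name, filled in via sender_keyword below.
        if not playback_started(interface, changed):
            return  # ignore pauses, volume changes, metadata updates, and so on
        print("Playback started on", name)
        player = dbus.Interface(bus.get_object(name, MPRIS_PATH),
                                dbus_interface=PLAYER_IFACE)
        # A plain int is enough; the dbus module introspects the parameter type.
        player.Seek(seconds_to_offset(-rewind_seconds))

    bus.add_signal_receiver(
        handler_function=signal_received,
        path=MPRIS_PATH,
        dbus_interface=PROPS_IFACE,
        signal_name="PropertiesChanged",
        sender_keyword="name")

    GLib.MainLoop().run()
```

To try it, call run() from a script or from the interactive Python shell while a media player with MPRIS support is open. Keeping the filtering logic in playback_started, separate from the D-Bus wiring, is a design choice of this sketch: the decision about which events to act on can then be checked without a running message bus.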
Before I extended the Clock applet, when the computer woke up from sleep mode, the clock would sometimes take up to a minute to update to the current time; until then, the clock would display the time at which the computer was put to sleep. This could be confusing to the user, who expected that the clock would display the correct time no matter the circumstances. To solve this issue, I added code to the Clock applet to listen for the PrepareForSleep signal, much like the code in Listing 8 does. Every time logind sent the PrepareForSleep signal, my code would test the signal’s boolean parameter, and if it was false, my code would force an update of the clock display.
Some Linux distributions opt to use a similar daemon called ConsoleKit2 [16] instead of logind. If your distribution uses ConsoleKit2, the code in Listing 8 will not work. You will have to make some minor changes to the code to make it compatible. Specifically, the bus name for ConsoleKit2 is org.freedesktop.ConsoleKit instead of org.freedesktop.login1, the object path is /org/freedesktop/ConsoleKit/Manager instead of /org/freedesktop/login1, and the interface name is org.freedesktop.ConsoleKit.Manager instead of org.freedesktop.login1.Manager. No other modifications need to be made to the code. For the sake of completeness, I have included the ConsoleKit2-compatible version of the code in Listing 9.

Listing 9: Print with Wake Up (ConsoleKit2)
#!/usr/bin/env python3

import dbus
from dbus.mainloop.glib import DBusGMainLoop
# The next three lines are missing from this copy of the listing; they are
# reconstructed here (an assumption) so that the program runs.
from gi.repository import GLib

def signal_received (entering_sleep):
    if not entering_sleep:
        print ("Computer just resumed from sleep mode")

DBusGMainLoop (set_as_default=True)

bus = dbus.SystemBus ()
bus.add_signal_receiver (
    bus_name="org.freedesktop.ConsoleKit",
    path="/org/freedesktop/ConsoleKit/Manager",
    dbus_interface="org.freedesktop.ConsoleKit.Manager",
    signal_name="PrepareForSleep",
    handler_function=signal_received)

loop = GLib.MainLoop ()
loop.run ()

Conclusion
In this article, I have covered the basics of inter-process communication using D-Bus. I have given a number of examples showing how to issue commands to other programs and also how to wait for and receive messages back from those programs. The material in this article could be combined to create, for example, a plug-in to a text editor to integrate with running media players. This could be useful to a transcriptionist, because it could mean that the audio playback could be controlled with a few keypresses without ever having to change focus between the text editor and the media player. Obviously, controlling media players is only the tip of the iceberg, and you could use these techniques to control other applications as well.
There are a few topics that I did not cover in this article. Most notably, I glossed over the subject of D-Bus properties, though the examples in Listing 6 and Listing 7 did briefly use them. Properties are essentially just program variables that can be accessed using standard D-Bus methods. Once you understand how to use methods, there is nothing particularly special about D-Bus properties.
For brevity, I also did not cover implementing custom D-Bus services – I only explained how to use existing services provided by other programs. I omitted this subject because I wanted to demonstrate the concepts of D-Bus with real examples that produce real, immediate, tangible results. If you want to learn more about D-Bus and its full capabilities, I strongly suggest reading the official D-Bus documentation [2].

Info
[1] X Window System: https://www.x.org/wiki/
[2] D-Bus: https://www.freedesktop.org/wiki/Software/dbus/
[3] systemd: https://systemd.io/
[4] Python programming language: https://www.python.org/
[5] dbus Python module: https://dbus.freedesktop.org/doc/dbus-python/
[6] GLib: https://docs.gtk.org/glib/
[7] Gnome desktop environment: https://www.gnome.org
[8] Media Player Remote Interfacing Specification (MPRIS): https://specifications.freedesktop.org/mpris-spec/latest/
[9] VLC: https://www.videolan.org/vlc/
[10] QMMP: https://qmmp.ylsoftware.com/
[11] Celluloid: https://github.com/celluloid-player/celluloid
[14] logind: https://www.freedesktop.org/software/systemd/man/latest/org.freedesktop.login1.html
[15] MATE Panel Clock applet: Update the displayed time when the computer wakes up: https://github.com/mate-desktop/mate-panel/pull/1361
[16] ConsoleKit2: https://consolekit2.github.io/ConsoleKit2/

Author
Michael Williams, better known by his pseudonym Gordon Squash, is a freelance open source software developer. He is a member of the Core Developers Team of the MATE Desktop Environment project (https://mate-desktop.org/). He enjoys hacking anything related to the GTK+ GUI widget toolkit and continues to develop a fork of GTK+ called STLWRT (https://github.com/thesquash/stlwrt) when time permits. You can see some of the other projects he is currently up to at his personal GitHub page, https://github.com/thesquash/.

Choices
Open source database management systems offer greater flexibility and lower costs while avoiding
vendor lock-in. Finding the right one depends on your project’s needs. By Amy Pettle
We live in a digital age where data is king. To efficiently manage that data, you need a database management system (DBMS). A DBMS lets you store, retrieve, and manipulate your data. It functions as a mediator between the database, applications, the developer, and the user interface (Figure 1). You can use a DBMS for simple data storage and retrieval or for more complex data-driven tasks.
When it comes to choosing a DBMS, you may be overwhelmed with options. A quick Google search pulls up dozens of DBMS solutions. You’ll find open source and closed source solutions. Some DBMSs use SQL to structure their data, while others go the NoSQL route. Finally, some DBMSs are better suited for enterprise environments.
To narrow the field, you need to consider your project’s data management needs. In this article, I will explain the advantages of an open source DBMS solution, break down the differences between SQL and NoSQL DBMSs along with some examples of each type, and provide some criteria for choosing a DBMS. Let’s get started.

Why Open Source
The first thing you should consider in selecting a DBMS is whether to use closed source or open source software. With closed source (proprietary) software, access to the source code is restricted. Open source software, on the other hand, gives the user the right to freely use, modify, and share the source code.
Proprietary solutions are not without benefits. Sometimes a proprietary DBMS offers a unique solution that happens to fit your needs. A vendor might offer 24/7 support from a single source or build protection for added security. However, all of this comes at a price and can result in vendor lock-
Lead Image © ioannaalexa, 123RF.com
download. You also avoid licensing or registration fees for reusing, modifying, or distributing the software, and you won’t be surprised by potentially rising renewal costs when your proprietary software subscription comes due.
You can also sidestep vendor lock-in with an open source DBMS. You won’t be forced to purchase bundled technology that doesn’t meet your needs. And if those needs change, you are free to redesign your system. With an open source DBMS, you can easily scale up or scale down to respond to environmental changes. In addition, you are free to try out new open source apps without affecting your budget.
You will find open source DBMS solutions at work in a wide range of industries, including e-commerce, healthcare, government, nonprofit organizations, financial services, and the high-tech field.

Common DBMS Types
While there are several types of DBMSs, the two most common are relational DBMSs (RDBMSs) and non-relational DBMSs. An RDBMS stores data in a highly structured format. Most RDBMSs today use Structured Query Language (SQL) to store and manage the data. A non-relational DBMS, more commonly known as a NoSQL DBMS, handles less structured data. Both have their strengths and weaknesses. Ultimately, your project’s data will dictate which type will provide the best solution.

SQL DBMSs
Used as back-end data systems for decades, SQL DBMSs are the most commonly used type of DBMS. An SQL-based RDBMS implements a predefined strict schema, which defines how the data is organized (including logical constraints such as table names, fields, data types, and relations). With a focus on consistency and availability, an RDBMS works best for data that is structured and related.
Data in an RDBMS is stored in tables consisting of rows (or records) and columns (or record attributes). Each table represents a relation with the rows holding individual records that pertain to that relation. You can connect one table to another using either a primary or foreign key relationship. A primary key functions as a unique identifier for each row (aka record) in a given table to prevent records from having the same value. A foreign key lets you link tables; it is a column or set of columns in one table that references a primary key in another table. By combining rows from two or more tables based on a shared related column, you can perform complex joins.
To ensure that database transactions (defined as a series of operations) are processed reliably, an RDBMS maintains Atomicity, Consistency, Isolation, and Durability (ACID) compliance. ACID compliance is an all-or-nothing approach – either all changes within a transaction are committed or none of them are. If a transaction is ACID compliant, you are guaranteed that a database is consistent before and after the transaction. Mission-critical applications in particular require ACID compliance.
Most RDBMSs scale vertically, with the data residing on a single server. To scale up, you can add more power to the server (CPU, GPU, RAM), but scaling usually requires downtime because you have to take the server offline to make any upgrades. You can scale an RDBMS horizontally, where the data is spread or shared over multiple servers, but it is a much more difficult process. The complexity of maintaining ACID compliance and managing distributed transactions and joins can require data structure changes along with other design considerations.
Optimized for speed, RDBMSs offer fast SQL queries. They perform well for intensive read/write operations on small to medium datasets, but performance can begin to suffer if the number of user requests or the amount of data grows. To improve data retrieval speed, you can add indexes to data fields to query and join tables.
If you have highly structured data that doesn’t change frequently, an RDBMS is a good choice. It offers a higher degree of data integrity, is able to handle complex queries, and is a better choice for transaction-oriented systems thanks to its ACID compliance. Examples of open source RDBMSs include MySQL, MariaDB, PostgreSQL, Firebird, and CUBRID.

MySQL
MySQL [1] was initially released in 1995 and acquired by Oracle in 2010. As part of the Linux, Apache, MySQL, PHP (LAMP) stack, MySQL is one of the more popular DBMSs. MySQL ranked second in the Databases category for all respondents on the 2023 Stack Overflow Survey and first for those learning to code [2].
Offering features typical of an RDBMS, MySQL uses a client/server system with a multithreaded SQL server, MySQL Server. MySQL Server supports different back ends, client programs and libraries, administrative tools, and APIs. It is a transaction-safe, ACID-compliant RDBMS that offers features such as full commit, rollback, crash recovery, and row-level locking.
MySQL has the advantage of being quick to install and easy to learn and manage. It is capable of handling a large volume of data and is compatible with most operating systems.
MySQL delivers high availability and disaster recovery thanks to its fully integrated replication technologies. To handle read scaling and report query offloading, MySQL relies on simple synchronous replication, while it uses asynchronous replication for high availability.
In terms of scalability, MySQL’s native replication architecture lets it scale to meet demands. In fact, Facebook uses MySQL [1] to scale applications to meet its users’ needs. You can also use MySQL as an embedded multithreaded library that you can link to standalone applications, making it smaller, faster, and easier to manage.
For security, MySQL uses built-in data masking and dynamic columns. It is popular with e-commerce, web-based transactions, data warehousing, and online transaction processing (OLTP). It can even be used with some online analytical processing (OLAP) workloads.
MySQL Community Edition [3] is free and open source under a GPL license and boasts a large, active community of open source developers, but it does not receive MySQL support. If you plan to embed MySQL in a commercial app, you will need a commercial license. The proprietary MySQL Enterprise Edition provides additional components for
monitoring and online backup, high availability and security enhancements, and is built to interface with other enterprise-grade tools.

MariaDB
When Oracle acquired MySQL in 2010, MySQL founder Michael Widenius created MariaDB [4] as a fork of MySQL. Today, MariaDB has evolved beyond being a drop-in replacement for MySQL. It has added many new features, but it still maintains a high level of compatibility with MySQL. Consequently, most applications that work with MySQL work with MariaDB.
MariaDB supports modern SQL features such as common table expressions and temporal data tables. It also offers a large number of JSON functions to handle unstructured data [5]. Because MariaDB uses a dynamic thread pool, which allows threads to be retired, it benefits from improved speed, enhanced replication, and faster updates.
You can extend the MariaDB front end beyond pure transactional processing with pluggable storage engines from InnoDB, MyRocks, Aria, ColumnStore, and other third-party engines. With the MariaDB ColumnStore plugin, you can perform columnar analytics or hybrid smart transactions.
A Galera cluster engine allows for replication and state transfer. In terms of high availability, MariaDB MaxScale (available under a BSL license) can be used to provide a database proxy with failover and transaction replay capabilities.
MariaDB use cases include web and mobile apps, data warehousing, and data analysis. Companies such as Wikipedia, WordPress.com, and Google all use MariaDB.
MariaDB operates under a GPLv2 license and promises to remain open source. You can freely copy it into your project’s local package repositories, which makes the deployment process easier and limits licensing obligations. MariaDB Community Server is free, but the subscription-based MariaDB Enterprise Server is recommended for production environments.

PostgreSQL
PostgreSQL [6], an object-relational DBMS (ORDBMS), is known for proven architecture, reliability, data integrity, a robust feature set, and extensibility. Originally part of the POSTGRES project at UC Berkeley, it has over 35 years of active development. In the Stack Overflow Developer 2023 Survey, it took first place in the Databases category among all respondents as well as professional developers [2].
While PostgreSQL uses SQL, it extends the query language by adding features for safely storing and scaling complicated workloads. It conforms to the current SQL standard unless the standard contradicts traditional features or leads to poor architectural decisions. While most required SQL features are supported, they may have a slightly different syntax or function in PostgreSQL.
PostgreSQL can interact with both relational and object-oriented databases and offers features not traditionally available in RDBMSs. Because PostgreSQL supports complex objects, table inheritance, and additional data types beyond JSON, it can handle more complex workloads and schema designs. As a result, PostgreSQL can perform complex queries in a large database more quickly.
ACID-compliant since 2001, PostgreSQL achieves consistency through Multi-Version Concurrency Control (MVCC) and Write-Ahead Logging (WAL). It offers asynchronous replication, failover, full redundancy, full-text database searches, and native support for JSON-style storage, key-value storage, and XML. In addition, you can customize PostgreSQL with plugins and extensions.
PostgreSQL scales well but has a heavier install footprint than MariaDB or MySQL. Use cases include complex data analysis, data science, AI-related capabilities, finance, manufacturing, and geographic information systems (GIS).
PostgreSQL is open source under the PostgreSQL License, which is similar to an MIT or BSD license.

Firebird
Firebird [7] offers concurrency, high performance, and language support for stored procedures and triggers. Based on InterBase 6.0 source code released in 2000 from Inprise (now the Borland Software Corporation), Firebird has been used in production systems under various names since 1981.
Supporting many ANSI SQL standard features, Firebird uses Procedural SQL (PSQL) as its internal language for stored procedures. It is 100-percent ACID-compliant and uses a multi-generational architecture (similar to MVCC) to ensure OLTP and OLAP operation. It offers careful writes that result in fast recovery with no need for transaction logs. With a small footprint, Firebird requires minimal configuration.
Firebird achieves high availability through optimistic locking at the record level, which reduces wait times. It adapts to fluctuating workloads and can be replicated to safeguard against disk failure. Online backups using snapshots allow for 24/7 operation.
You can scale Firebird in any direction with growth limited to available disk storage, which can be spread across multiple hard disks. With Firebird, the database engine manages the database’s on-disk structure independent of the filesystem. You can embed the engine as a standalone client application, deploy it in a 2-tier client/server LAN with support for up to 750 users, or use it in a multi-tier system with thousands of users.
Firebird offers security via user authentication at the server level, SQL privileges within the database, optional database encryption via plugins, and wire protocol encryption between client and server.
Developed by a large community of independent C/C++ programmers, technical advisors, and supporters, Firebird provides documentation on their website along with community support through mailing lists [8].
With Firebird’s Initial Developer’s Public License (IDPL), you can build a custom version as long as any modifications are made available under an IDPL license, which enables other users to build on and use your modifications. An InterBase Public License (IPL) covers the source code inherited from InterBase, while the IDPL license applies to any additions or improvements made by the Firebird Project to the original InterBase code. There are no fees to download, register, license, or deploy Firebird “even for commercial developers.”
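To make the relational concepts from the SQL DBMSs section more concrete (primary and foreign keys, joins, all-or-nothing transactions, and indexes), here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in; SQLite is not one of the DBMSs covered in this article, and the customers and orders tables are hypothetical. The same ideas carry over to the RDBMSs described above, although the exact SQL dialect and driver will differ.

```python
import sqlite3

# An in-memory SQLite database stands in for a full RDBMS (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# A predefined, strict schema. Table and column names are hypothetical.
conn.executescript("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,   -- primary key: unique identifier per row
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),  -- foreign key
        total_cents INTEGER NOT NULL
    );
    -- An index on the join column speeds up lookups and joins.
    CREATE INDEX idx_orders_customer ON orders(customer_id);
""")

conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada'), (2, 'Linus')")
conn.execute("INSERT INTO orders (customer_id, total_cents) "
             "VALUES (1, 1999), (1, 500), (2, 4200)")
conn.commit()

# Join: combine rows from both tables through the shared key column.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total_cents)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.id
""").fetchall()
print(rows)

# Transaction: either both inserts are committed or neither is.
failure = None
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (2, 1000)")
        # Customer 99 does not exist, so the foreign key check rejects this row.
        conn.execute("INSERT INTO orders (customer_id, total_cents) VALUES (99, 1000)")
except sqlite3.IntegrityError as exc:
    failure = str(exc)

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print("Rolled back:", failure, "| orders in table:", order_count)

# Ask the query planner whether the lookup uses the index.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM orders WHERE customer_id = 1").fetchall()
uses_index = any("idx_orders_customer" in row[-1] for row in plan)
print("Index used:", uses_index)
```

The failed transaction is the part worth noticing: the first insert inside the with block is valid on its own, yet it disappears along with the invalid one, so the table still holds the three original orders. Storing money as integer cents is a choice made for this sketch to keep the arithmetic exact.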
a commercial version, MongoDB Enterprise Advanced.

JanusGraph
As its name suggests, JanusGraph [12] is a scalable, distributed, NoSQL graph DBMS that lets you store and query graphs with hundreds of billions of nodes and edges. JanusGraph was initially released in 2017 as a Linux Foundation project.
Thousands of concurrent users can use JanusGraph in real time to perform complex graph traversals. JanusGraph supports multiple storage back ends, including Apache Cassandra, Apache HBase, Google Cloud Bigtable, Oracle Berkeley DB, and ScyllaDB. Advanced capabilities such as full-text search are optionally available through third-party tools such as Elasticsearch, Apache Solr, and Apache Lucene. JanusGraph can also support OLTP and OLAP with Apache Spark integration.
To ensure performance and fault tolerance, JanusGraph uses data distribution and replication. It also offers multi data center high availability and hot backups.
JanusGraph can scale to accommodate an increase in both data and users, but its scalability ultimately depends on the back end you use with it.
JanusGraph is totally free under an Apache v2 license.

Redis
Redis [13] is a key-value, NoSQL, in-memory store used as a DBMS. You can use Redis as a cache, a vector database, a document database, a streaming engine, or as a message broker. In its most popular use case, Redis can function as an enterprise-class session cache.
To handle caching, queuing, and event processing, Redis supports in-memory data structures (e.g., strings, hashes, sets, sorted sets, and streams). You can program Redis using server-side scripting with Lua and server-side stored procedures with the Redis Functions API. You can also build custom extensions for Redis in C, C++, and Rust using a module API.
With Redis, your dataset is kept in memory for fast access. You can also persist all writes to permanent storage to protect against reboots and system failures. Redis supports asynchronous replication, and replication and automatic failover are available for both standalone and clustered deployments.
Clustering provides horizontal scalability in Redis with hash-based sharding. To grow a cluster, automatic repartitioning lets Redis scale up to millions of nodes.
Redis is licensed under a three-clause BSD license [14]. You can use, modify, and redistribute Redis in source and binary form as long as you include the provided copyright notice, list of conditions, and disclaimer. You also cannot use the names of Redis and its contributors to promote or endorse products derived from the software without prior written permission.
For documentation, community support, and mailing lists, see the Redis site.

CouchDB
CouchDB [15] from Apache, a NoSQL document-based DBMS, is a single-node DBMS that works behind the application server of your choice.
With a schema-free document model and built-in query engine, CouchDB can handle spikes in usage. It uses an intuitive JSON/HTTP API, supports binary data, and can be used on servers ranging from a Raspberry Pi to a cloud installation.
The CouchDB Replication Protocol [16] makes Offline First apps possible. An Offline First app lets an organization perform some or all of its business logic without access to an Internet connection. This is particularly useful for mobile apps or any environment with an unreliable network infrastructure.
CouchDB is designed for reliability. Single nodes rely on a crash-resistant, append-only data structure, while multi-node clusters use redundancy to make your data available whenever you need it.
You can easily scale CouchDB horizontally by adding more clusters to meet your demands.
CouchDB is licensed under an Apache v2 license. You’ll find documentation, community chat channels, and mailing lists on the CouchDB website.

Making a Choice
Picking the right open source DBMS comes down to your project’s unique data management needs. In making your decision, there are several things you should consider.
First, think about how your project uses data to determine which DBMS will provide the needed features, capabilities, and complexity. In terms of capacity, look at how data will flow through your database structure with particular consideration for the amount of data, storage needs, access requirements, and number of concurrent users. Another thing to consider is workload. Not only should you be thinking about your project’s data types, query types, performance expectations, and business requirements, but also consider whether your workload is consistent or if it experiences event-driven changes. Finally, business needs change over time, so look for a DBMS that supports multiple use cases (not just your current use case) with no downtime. A rigid system with limited use cases may force you to migrate to a different DBMS in the future if your needs change.
In terms of scalability, determine whether a given DBMS can meet your organization’s evolving data usage, customer needs, and compliance requirements over time. For enterprise-grade systems, it’s particularly important to consider high availability and disaster recovery and whether a DBMS can scale to meet those demands. Also, don’t rule out the need to scale down in the future.
In considering a DBMS, take a close look at its community support. A strong active community means developers are actively working to make stronger, more secure code. Because free open source DBMSs don’t usually include tech support, look to see if the community supplies tutorials, forums, and documentation. In addition to being a good source of information, these resources will come in handy if you encounter any technical issues.
While open source is free, keep in mind that there may be associated costs with implementation. Most open source DBMSs charge extra fees for enterprise-grade plans and additional features. If you need more robust support than what is provided by the community documentation and forums, you may need to pay for the commercial
version or contract with a third-party vendor.

In addition to cost, you should also consider the technical skills required to implement and maintain an open source DBMS. Do you have existing staff (with both the skills and the resources) to handle your chosen DBMS? If not, then you need to factor in whether you can afford to hire more staff or get outside help.

Two final things to consider are security and compatibility. When it comes to security, look for DBMS features such as secure logins, encryption, role-based access, control, and compliance. As for compatibility, you need to ensure that the new DBMS software works with the existing software and tools that your organization uses.

Enterprise Considerations
If you plan to use your DBMS in an enterprise-grade capacity, you’ll need to consider a few additional criteria.

First and foremost, you want to look for a solution that promises stability. Depending upon how you’ll use the distribution, ACID-compliance is particularly important for stability. For all use cases, a proven track record, a planned release cycle, high quality documentation, and support forums can provide assurance regarding the DBMS’s longevity and community involvement.

For enterprise-grade systems, high availability is particularly important. Look for solutions that offer redundant and fault-tolerant components at the same location. And in case of a high availability failure, you’ll want a solution with a disaster recovery plan to get your system back up and running.

Finally, you may need to consider paying for additional extensions or add-ons that can make an open source DBMS enterprise-grade. Most of the DBMSs highlighted here offer an enterprise version. You may also need to consider paying for outside 24/7 support from a third-party vendor if you don’t have the staffing resources in house.

Conclusion
An open source DBMS can offer many benefits over closed source alternatives. With the right considerations, you can use these open source solutions in enterprise environments. To find the best one for your organization, a close look at your present and future needs will help you narrow down the field.

This article was made possible by support from Percona LLC, through Linux New Media’s Topic Subsidy Program (https://www.linuxnewmedia.com/Topic_Subsidy).

Author
Amy Pettle is an editor for ADMIN and Linux Magazine. She started out in tech publishing with C/C++ Users Journal over 20 years ago and has worked on various Linux New Media publications.

Info
[1] MySQL: https://www.oracle.com/mysql/what-is-mysql/#:~:text=the%20SQL%20syntax.-,MySQL%20is%20open%20source,code%20to%20suit%20your%20needs
[2] Stack Overflow 2023 Developer Survey: https://survey.stackoverflow.co/2023/#section-most-popular-technologies-databases
[3] MySQL Community Edition: https://www.mysql.com/products/community/
[4] MariaDB: https://mariadb.com/products/community-server/
[5] MariaDB JSON functions: https://mariadb.com/kb/en/json-functions/
[6] PostgreSQL: https://www.postgresql.org/about/
[7] Firebird: https://firebirdsql.org/en/about-firebird/
[8] Firebird support: https://firebirdsql.org/en/mailing-lists/
[9] CUBRID: https://www.cubrid.org/cubrid
[10] Apache Cassandra: https://cassandra.apache.org/_/index.html
[11] MongoDB: https://www.mongodb.com/try/download/community
[12] JanusGraph: https://janusgraph.org
[13] Redis: https://redis.io/
[14] BSD 3 clause license: https://opensource.org/license/BSD-3-Clause
[15] CouchDB: https://couchdb.apache.org/#top
[16] CouchDB Replication Protocol: https://docs.couchdb.org/en/stable/replication/protocol.html
Sometimes the only way to break into an SSH server is through brute force – and yes, there are tools for that. By Chris Binnie

One particular Linux service needs no introduction: Secure Shell (SSH) is synonymous with logging into remote Linux devices of all varieties. You can use SSH to log into a Raspberry Pi, a mail server, a

that you own or explicitly have permission to test against. A number of these approaches could cause downtime or ultimately lock you out of the target SSH instance.

can highlight open ports on IP addresses in a matter of seconds. At that point, I can sort the wheat from the chaff very quickly and focus my attention on items of interest.

On Debian Linux derivatives, like Ubuntu Linux, you can install masscan using the following command:

$ apt install masscan -y
began to warn that it had its own secu- A number of tools are specifically robertdavidgraham/masscan
$ cd masscan
$ make

masscan has a limited number of dependencies, so it is relatively simple to install; refer to the README if you have any questions.

Once masscan is installed, you can get some practice by trying it out on a Capture the Flag (CTF) exercise at the inimitable TryHackMe [3] website. I use the following command to focus on one specific IP address:

$ IP="10.10.XX.XX"; masscan -p0-65535 --rate 10000 ${IP} -e tun0 --router-ip 10.11.0.1

In the preceding example, I only append

--echo > masscan.conf

a reusable configuration file is created. Listing 3 shows the contents of the masscan.conf file.

Rather than entering a long, complex command at the command line, you can use the configuration file, as follows:

$ masscan -c masscan.conf --rate 10000

You can also use masscan to discover open port banner information. Have a look at the README.

Protect the Innocent
Imagine you have identified a couple of SSH servers using masscan and want to look closely at how SSH is set up. At this point, I would turn to a tool called ssh-audit [4]. I think you might be surprised at how much information you can get from an SSH server.

Listing 4: Installing ssh-audit
$ pip3 install ssh-audit
Collecting ssh-audit
Downloading ssh_audit-2.9.0-py3-none-any.whl (97 kB)
98.0/98.0 kB 317.3 kB/s eta 0:00:00
Listing 3: masscan.conf
seed = 535234345767656361
rate = 100
shard = 1/1
nocapture = servername
ports = 22,2000-3000
range = 127.0.0.0/8
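To get a feel for how much work the ports line in Listing 3 asks of the scanner, the following Python sketch expands a port specification into individual port numbers. It assumes that a comma separates entries and that a dash denotes an inclusive range, which is how the specification in Listing 3 reads; the function name expand_ports is hypothetical and is not part of masscan.

```python
def expand_ports(spec):
    """Expand a spec such as '22,2000-3000' into a sorted list of port numbers.

    Assumption: commas separate entries and 'low-high' is an inclusive range.
    """
    ports = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (int(p) for p in part.split("-", 1))
            ports.update(range(low, high + 1))
        else:
            ports.add(int(part))
    return sorted(ports)


# The value of the "ports" line in Listing 3
ports = expand_ports("22,2000-3000")
print(f"{len(ports)} ports per target, from {ports[0]} to {ports[-1]}")
```

Multiplying the number of ports by the number of addresses in the range line gives a rough idea of how many probes a scan will send at the configured rate.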
Figure 2: Online scanning with ssh-audit. © https://www.ssh-audit.com

Uphold the Law
An excellent addition to the available tools for scanning SSH servers is a lightweight tool called ScanSSH [7]. To install ScanSSH on Debian Linux, I use the following command:

Shreder is a powerful multi-threaded SSH protocol password brute-force tool.
positional arguments:
target
options:

have more limited choices for breaching SSH instances. Often the only choice a potential intruder has for breaching SSH is a brute force attack.

Numerous tools exist for brute force attacks on SSH servers, and I’ll discuss a pair of them in this article: Shreder and sshbrute. I should also mention one tool that I couldn’t get to work. The tool is called mec (Mass Exploit Console) [8], and you can find it on GitHub [9].

Et Tu, Brute?
The Internet is littered with password lists that were mostly compiled from successful attacks. If you are in any doubt how pervasive such lists are, search with your favorite search engine, or visit GitHub and look

Figure 6: ScanSSH can also work with proxies.

names tend to be email addresses, when it comes to SSH, logins tend to follow the more traditional user ID format. For example, the format for Chris Binnie might be chrxbinx001, or in a smaller organization, just the first name of the user.

In the pass.txt file, I have the contents of 1000-most-common-passwords.txt [11], which, as the name suggests, is roughly 1,000 password entries long.

EntySec/Shreder

You’ll see a considerable amount of output during the installation. Once Shreder is installed, run it with the -h switch to view the available options (Listing 5).

To see Shreder in action, consider the following command:

$ shreder -p 2002 -u chris -l pass.txt -d 1 18.XX.XX.XX
The command structure is really simple. The SSH server is running on a non-standard port (TCP port 2002). I know the user chris exists, and I’m pointing the 1,000 common passwords file (named pass.txt) at the referenced IP address. So that I don’t stress my laptop or the SSH server, rather than using a delay of one second as shown in the previous command (-d1), I chose -d3 instead to ensure that Shreder pauses for three seconds between login attempts.

When I run that command, the output is as follows:

[*] Processing... \ | Passwords tried: 15/1025

If I reduce the delay to one second, after a while I start getting Connection reset by peer error messages. If I add the password that I know is valid to the password file as a test, Shreder is really efficient and gets straight to the point, showing success almost straight away:

[+] Password has been found!
[i] Password: nothingtoseehere
[i] Time elapsed: 34.217505216598511

With a valid username in hand and a password list, you now have the ability to test the brute forcing of SSH servers with ease.

Unhand Me
An alternative tool you could use for a brute force attack on an SSH server is sshbrute [14], which is provided by a Red Team fan called Machine1337.

To install sshbrute, which is also written in Python, the prerequisites are Python 3 and a couple of Python modules, which you can install with:

$ pip install paramiko
$ pip install termcolor

Once the necessary Python modules are in place, it is just a case of cloning the repository. Although the docs mention a requirements.txt file, the tool seems to check Python modules automatically, so you can just ignore that part, after running the following commands:

$ git clone https://github.com/machine1337/sshbrute
$ cd sshbrute

The directory listing looks like this once the repo is cloned:

$ brute.py LICENSE README.md

To run the tool, simply type:

$ python3 brute.py

As Figure 7 shows, you’ll need to answer a few simple questions; then you’ll be presented with a simple user interface (UI).

Figure 7: sshbrute presents a simple but effective UI.

Within a matter of moments, sshbrute found the password and happily output the following information:

[+] BruteForce Started....
[*] Username: chris | [*] Password: nothingtoseehere
[?] Valid Credentials Found
[*] Completed...

It’s an impressive if somewhat worrying process, from start to finish, just as using Shreder was. Clearly attackers that have the ability to test hundreds of thousands of passwords against a username with impunity have to wait much longer for results. Nonetheless, it is much easier than many SSH users think.

Cut It Out
If compromising an SSH server looks all too easy, you might be wondering what you can do to prevent this kind of attack. The following is a non-exhaustive list of tips for preventing attacks on SSH. It is worth underlining the fact that rarely do any of these individual security controls act as the panacea to fully secure your SSH servers. However, making use of several of these techniques in tandem will give you a fighting chance to stop attackers. Having said all of that, the first item of the list will make a massive difference to how long your SSH server lasts online without being compromised (even if you don’t patch the OpenSSH package for a while).

• Limit IP addresses – this is the first thing I do whenever I set up an SSH server. Building an allow list can be difficult without a permanent IP address, if your broadband router uses dynamic public IP addresses or your VPN shifts IP addresses frequently. If this is the case, I’d recommend looking for alternatives (I’ll leave you to research them online). Be creative and use options like Port Knocking, which I wrote about in ADMIN [15]. If you do have a permanent IP address, think about using iptables or TCP Wrappers (which I also wrote about in ADMIN [16]) to lock down SSH server access. I cannot emphasize the benefits of doing so strongly enough. It really is important to limit access to a select few IP addresses. You might even consider only letting, say, 256 IP addresses from your broadband router’s IP address pool connect.
• Enforce SSH keys – you should use keys and not passwords. If you must use passwords, make them greater than 12 characters long and complex. As for SSH keys, don’t leave them on devices that don’t use them. Treat them like precious passwords. Finally, you can also configure Pluggable Authentication Modules (PAM), using one of the many online guides [17], to limit the number of failed logins and enforce password complexity. Be sure to test carefully to avoid causing access problems.
• Monitor logins – If you run five or fewer servers and
there aren’t that many logins every day over SSH, you could employ this trick, which I have used on personal servers before to keep track of when users log in successfully. I add the command in Listing 6 to the foot of the /etc/profile file, and when Bash is spawned for a user login, an email is sent (note that it is encapsulated in brackets on purpose). See the article in Linux Magazine [18]. You will need an outgoing email server to be available; use something like Postfix with a basic configuration for outbound SMTP only.

Listing 6: Email Alert
(who -m |awk -v q="$(date +"%k:%M:%S %Z on %e %B")" \
'{print "User " $1 " logged in at " q " from IP "$5 }' | \
mail -s "Logged user on $(uname -n)" [email protected] &)

• Harden the /etc/ssh/sshd_config file – there are a plethora of online guides to help you tweak the SSH server’s configuration file [19]. Without fail, you should disable root user access and ideally add a group of permitted usernames, alter the standard port that the SSH server runs on (e.g., TCP port 2222) to prevent endless automated probes, lower the setting MaxAuthTries to reduce the maximum number of authentication attempts per connection to three from six, and disable X11Forwarding by changing yes to no. The list goes on and you should do more research online if you are unsure.
• Remove the version banner – if you do not use an allow list for IP addresses, try to avoid advertising the version of your SSH server to the whole Internet. You might be surprised at how difficult that is to do in OpenSSH. To switch off the banner, it really does require a bit more effort [20].
• Use Fail2ban [21] – I would be remiss not to mention Fail2ban; it really is one of the most sophisticated open source rate-limiting solutions out there. You may be surprised to read that I once wrote about it on the Linux Magazine website [22]. I cannot recommend it enough; just take care with your initial configuration to avoid locking yourself out. And, don’t be daunted by the regular expression configuration style; you’ll find lots of online examples to get you started.

Conclusion
It is perfectly possible to launch high-performance brute force attacks against SSH servers using lightweight tools. In this article, I wanted to demonstrate the effectiveness of lesser-known tools, and I think Shreder and sshbrute both passed with flying colors.

If you’re wondering what other tools are available, two of the most popular options for this kind of attack are Hydra (also called THC Hydra) [23] and the inimitable Nmap, when used with its ssh-brute scripting engine script [24].

I hope this article has given you some useful insights into how attackers approach SSH servers. Although SSH servers tend to be better protected than other network services due to their critical nature, they are far from impenetrable. If you don’t impose rate-limiting to limit brute force attacks, they are clearly vulnerable.

I also covered a number of mitigation techniques. At the risk of repeating myself, you can stop brute force attacks with rate-limiting in place, but ideally, you should use an allow list to define which specific IP addresses will have access.

Whether you are keen to keep your own servers compliant with security standards or you just want to practice some ethical hacking, the tools described in this article will keep you well prepared to hack and defend your SSH servers.

Info
[1] OpenSSH: https://www.openssh.com
[2] masscan: https://github.com/robertdavidgraham/masscan
[3] TryHackMe: https://www.tryhackme.com
[4] ssh-audit on GitHub: https://github.com/jtesta/ssh-audit
[5] ssh-audit: https://www.ssh-audit.com/
[6] ssh-audit README: https://github.com/jtesta/ssh-audit/blob/master/README.md
[7] ScanSSH: https://github.com/ofalk/scanssh
[8] mec: https://github.com/jm33-m0/mec
[9] mec on GitHub: https://github.com/jm33-m0/mec/blob/master/screenshot/mec.svg
[10] SecLists: https://github.com/danielmiessler/SecLists/blob/master/Passwords/Common-Credentials/10-million-password-list-top-1000.txt
[11] Common passwords: https://github.com/DavidWittman/wpxmlrpcbrute/blob/master/wordlists/1000-most-common-passwords.txt
[12] Shreder: https://github.com/EntySec/Shreder
[13] EntySec: https://github.com/EntySec
[14] sshbrute: https://github.com/machine1337/sshbrute
[15] “Protect Your Network with Port Knocking” by Chris Binnie, ADMIN, issue 23, 2014: https://www.admin-magazine.com/Archive/2014/23/Port-Knocking
[16] “Secure Your Server with TCP Wrappers” by Chris Binnie, ADMIN, 2012: https://www.admin-magazine.com/Articles/Secure-Your-Server-with-TCP-Wrappers
[17] About PAM configuration files: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system-level_authentication_guide/pam_configuration_files
[18] “Enhance and Secure Your Bash Shells” by Chris Binnie, Linux Magazine, issue 167, October 2014: https://www.linux-magazine.com/Issues/2014/167/Bash-Tricks/(offset)/6
[19] SSH hardening guides: https://www.sshaudit.com/hardening_guides.html
[20] Hide OpenSSH version banner: http://kb.ictbanking.net/article.php?id=666
[21] Fail2ban: https://www.fail2ban.org
[22] “Intrusion Detection with Fail2ban” by Chris Binnie: https://www.linux-magazine.com/Online/Features/Intrusion-Detection-with-fail2ban
[23] Hydra: https://github.com/vanhauser-thc/thc-hydra
[24] ssh-brute: https://nmap.org/nsedoc/scripts/ssh-brute.html

Author
Chris Binnie is a Cloud Native Security consultant and co-author of the book Cloud Native Security: https://www.amazon.com/Cloud-Native-Security-Chris-Binnie/dp/1119782236.
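Returning to the allow-list advice in this article, the membership test at the heart of an allow list can be sketched with Python's standard ipaddress module. This is only an illustration of the idea, not a description of how iptables or TCP Wrappers work internally. The networks below are placeholders from the address ranges reserved for documentation, and the function name is_allowed is hypothetical.

```python
import ipaddress

# Hypothetical allow list: one fixed address plus one /24 block (256 addresses).
# Both come from documentation-only ranges; substitute your own networks.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.10/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True if client_ip falls inside any network on the allow list."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


for candidate in ("198.51.100.77", "203.0.113.10", "192.0.2.5"):
    verdict = "allow" if is_allowed(candidate) else "deny"
    print(f"{candidate}: {verdict}")
```

A /24 block corresponds to the 256-address pool mentioned in the list of tips; anything outside the listed networks is denied by default.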
Pivot tables let you sort, rearrange, group, and perform calculations on your spreadsheet data. We help you get started with this powerful tool. By Marco Fioretti

Lead Image © vectordivider, 123RF.com

Almost everyone today works with spreadsheets from time to time. While most users are familiar with spreadsheet formulas, not everyone is familiar with another power feature, the pivot table. In this article, I show what a pivot table looks like in LibreOffice Calc and then explain what they do. Even if you are just a basic spreadsheet user, you won’t regret learning about pivot tables.

A pivot table lets you automatically process the raw data from a spreadsheet (or part of it). It groups all the data in one column according to the values in another column or columns and then makes a calculation on each group and presents the data in the most compact way possible. As an example, take the simple spreadsheet in Figure 1; it has three columns: Area, Year, and Contacts. The pivot table in Figure 2 is created by grouping all the data from the Contacts column according to the values in the Area column.

The Pivot Table Wizard
LibreOffice Calc provides a handy wizard for creating pivot tables (Figure 3). To get started, you first need to select the data from the spreadsheet around which you are going to pivot. If the cluster of cells that you want to use is recognizable without ambiguities because it is surrounded by empty columns and rows, or because there are simply no other data in the spreadsheet, you can click on any of those cells to start creating the pivot table. Otherwise, you will have to select all the cells that must be processed. Once you’ve selected the cell or cells you want to process, open the wizard by clicking on either Insert | Pivot Table or Data | Pivot Table | Insert or Edit… in the main menu.

In the wizard, drag the column you want Calc to analyze from the Available Fields: box into the Data Fields: box (Contacts in my example), and the spreadsheet column title (Area) you want to group on into the Row Fields:. For your pivot table to work, the first row in your spreadsheet must contain the column title; otherwise, the column title won’t appear in Available Fields, and there will be nothing to pivot
Manipulating Data
With a pivot table, you can rotate your data summary to look at it from different angles. Pivot tables offer a quick and easy way to sort, group, and process the raw data from a spreadsheet in order to rearrange, summarize, and expose further information about the data.

Figure 3: The Calc interface used to create pivot tables.

While all the examples in this article show sums of numbers, you can use other mathematical functions (as shown in Figure 7) to process the data. Most functions (e.g., Sum, Average, Median, Max, etc.) are only applicable to columns of numbers, but there are also functions such as Count that can accept columns of strings to count how many times each string appears in a column.

Once you have created a pivot table, you can also sort its rows in different ways, or see only the rows you need by clicking on the pull-down menu at the top of a column (as shown in Figure 8).
Figure 4: A spreadsheet of worldwide population forecasts for 2024-2033.

Figure 5: The pivot table wizard showing the column titles selected from Figure 4 along with the optional parameters available.

Figure 6: The pivot table generated from Figure 5 using the rightmost column as the grouping key.

Figure 7: Pivot tables can perform a number of mathematical functions.

In general, to the extent that a spreadsheet can be equated to a relational database, a pivot table is a query against that database, or some of its cells, that shows the results in table format. In practice, besides manually selecting cell ranges, you can create a pivot table from the two other types of sources, named ranges and registered data sources.

A named range is simply a section of a spreadsheet with an explicitly given name that’s easy to remember and call from other parts of the spreadsheet. In Calc, you can name a range of cells by selecting them and then clicking on Sheet | Named Ranges and Expressions | Insert in the main menu.
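The view of a pivot table as a grouping query can be sketched in a few lines of Python. This is a simplified illustration, not how Calc implements pivot tables: the rows are invented sample data with the same three columns as the example spreadsheet (Area, Year, Contacts), and the function name pivot_sum is hypothetical.

```python
from collections import defaultdict

# Hypothetical rows (Area, Year, Contacts); the numbers are invented
rows = [
    ("Central Asia", 2022, 10),
    ("Australia", 2022, 7),
    ("Central Asia", 2023, 5),
    ("Australia", 2023, 3),
    ("Europe", 2023, 12),
]


def pivot_sum(rows, key_indexes, value_index):
    """Group rows by the chosen key columns and sum the value column."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[i] for i in key_indexes)
        totals[key] += row[value_index]
    # Sort by key so the result reads like the rows of a pivot table
    return dict(sorted(totals.items()))


# One grouping key (Area), then two grouping keys (Area and Year)
by_area = pivot_sum(rows, key_indexes=(0,), value_index=2)
by_area_and_year = pivot_sum(rows, key_indexes=(0, 1), value_index=2)

for key, total in by_area.items():
    print(key, total)
```

Swapping the summing step for a count, an average, or a maximum corresponds to choosing a different function for the data field in the wizard.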
A registered data source, on the other hand, is an external database or spreadsheet that has been previously “registered” in LibreOffice. This lets you use all of the external spreadsheet’s content as if it were just a local sheet within the current spreadsheet. To learn how to register data sources in Calc, see the official documentation [2].

Regardless of the name or location, pivot tables let you group data in a spreadsheet by one or more properties, in order to show an aggregate value for each group.

Figure 8: You can filter and sort the rows of a pivot table.

Complexity Simplified
Don’t be fooled by the simplicity of the pivot tables shown here. For example, the pivot table in Figure 9 shows the same data from Figure 1, but it now processes and sorts the data using two grouping keys, Area and Year. Each of the numeric values in Figure 9 can be the sum (or some other function) of multiple rows containing the same values in the Area and Year columns. Creating a table like this without pivot tables would require a lot of manual work!

Figure 9: Pivot tables easily sort and process data according to multiple keys.

Pivot tables let you convert a spreadsheet containing multiple columns and rows into a very compact and self-explaining format with just a few clicks. The ability to perform visual synthesis on quantitative data like this, with very little effort, is very valuable both for discovering important information about your data and making decisions.

Figure 10 reorganizes the spreadsheet data shown in Figure 4 and restricts the data to only two years per area to make the following figures more readable. From left to right in Figure 10, the columns are divided into country type, country type per continent, the number of students starting primary school, and the year. (I will explain why I highlighted some cells later).

If you select two fields as Row Fields in the wizard (Figure 11), Calc will create the pivot table shown in Figure 12, where the first two columns are both used as grouping keys.

Now, go back and look at Figure 9, followed by Figures 11 and 12. Then, compare those figures with Figure 13. Comparing those pivot tables (and their data, of course) with their respective wizard options will help you understand the purpose of each wizard field, along with the effects of moving around available fields.

Charts
Another great feature is the ability to turn your pivot table into a dynamic chart (Figure 14) using the same procedure you would use with normal cells. The charts contain the same control filters present in the pivot table (Figure 15) allowing you to see only the parts of a chart that you really need (Figure 16).

Advanced Features
The Filters field (Figure 3) lets you further narrow your data. Using the data from the spreadsheet in Figure 1, drag and drop Year from Available Fields into Filters. Calc then will build the pivot table shown in Figure 17 instead
of the one shown in Figure 2. Then, you can select which year or years you want to filter from the drop-down menu, and Calc will only show the contacts for that year in the Contacts column.

For more complex filters, click on the Options drop-down menu and select Add filter (see Figure 6), which opens the pop-up Filter Criteria dialog (Figure 18), where you can use Boolean operators, as well as regular expressions, to create and combine as many filters as desired.

You will find other useful features under Options. To add up the totals for each column or row, select Total columns or Total rows (as shown in Figure 17). Ignore empty rows tells the

Limitations and Gotchas
As good as pivot tables are, there are a few limitations. Depending on your hardware and your spreadsheet’s size and complexity, generating a big pivot table may take several minutes.

Another thing to remember is that pivot tables are not dynamic. Unless you explicitly tell Calc to update it, a pivot table will never change, even if you modify its source data. To be sure you are looking at current data, go to Data | Pivot Table | Refresh or right-click on any table cell and click on Refresh in the pop-up menu.
Figure 16: The chart shown in Figure 14 after applying the filters shown in Figure 15.

Figure 17: You can add filters to the pivot table to restrict the data to a single criterion, e.g., a specific year.

Figure 18: For custom filters, the Filter Criteria dialog lets you use Boolean operators and regular expressions.

In addition, caution needs to be taken in using pivot table cells in formulas in other cells outside the pivot table. Certain changes to a pivot table’s source data will change the position of some of its data in terms of numbers and rows. For example, if Figure 2 was a normal set of spreadsheet cells instead of a pivot table, you could reuse total contacts from Central Asia by writing a formula that identifies that cell by column and row coordinates (i.e., B4). The problem arises if you later add a row to the original spreadsheet (Figure 1) for contacts from Antarctica, which in turn adds Antarctica as row 3 in the Figure 2 pivot table and forces the other cells to reorder. Any formula that references cell B4 will now use the contacts from Australia (the new row 4 in the pivot table), not Central Asia! Instead of using cell coordinates, you must reference those cells with the function GETPIVOTDATA(), which uses category names (e.g., “Central Asia” and “Contacts”) as coordinates instead of row and column labels. This ensures that formulas will always use the correct value, even if the value moves to another cell. (See [3] for more information on using formulas in pivot tables.)

Conclusion
A pivot table is a powerful tool that lets you manipulate your data in many useful ways. Once you become familiar with the pivot table wizard and its options, pivot tables are very easy to build. You’ll also find uses for pivot tables outside of ordinary spreadsheets, as shown in my datamash article [4]. So, try them, I say!

Info
[1] United Nations, World Population Prospects 2022 - Special Aggregates: https://population.un.org/wpp/Download/SpecialAggregates/Geographical/
[2] Registering data sources in Calc: https://books.libreoffice.org/en/BG72/BG7207-LinkingToDatabases.html#toc5
[3] Calc Guide 7.1: https://books.libreoffice.org/en/CG71/CG7108-PivotTables.html
[4] “Data Processor” by Marco Fioretti, Linux Magazine, issue 278, January 2024, pp. 48-53

Author
Marco Fioretti (https://mfioretti.substack.com) is a freelance author, trainer, and researcher based in Rome, Italy, who has been working with free/open source software since 1995, and on open digital standards since 2005. Marco also is a board member of the Free Knowledge Institute (http://freeknowledge.eu).
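The pivot table article's warning about cell coordinates, and the reason GETPIVOTDATA() exists, can be illustrated with a small Python analogy. It compares a lookup by position with a lookup by category name after a new area is added to the source data. This is not Calc's implementation; the contact numbers and the helper names build_pivot and lookup_by_key are invented for the sketch.

```python
def build_pivot(rows):
    """Return sorted (area, total) pairs, mimicking a one-key pivot table."""
    totals = {}
    for area, contacts in rows:
        totals[area] = totals.get(area, 0) + contacts
    return sorted(totals.items())


def lookup_by_key(pivot, area):
    """Find a total by its category name, wherever the row ends up."""
    return dict(pivot)[area]


# Hypothetical source data, before and after adding a row for Antarctica
rows = [("Australia", 7), ("Central Asia", 15), ("Europe", 12)]
before = build_pivot(rows)
after = build_pivot(rows + [("Antarctica", 2)])

position = 1  # the pivot row that held Central Asia before the change

print("by position, before:", before[position])
print("by position, after: ", after[position])
print("by name, after:     ", lookup_by_key(after, "Central Asia"))
```

The positional lookup silently switches to a different area once the table reorders, while the lookup by name keeps returning the Central Asia total.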
Tiger, Tiger, Burning Bright
The revived Tiger provides a comprehensive set of security audit and intrusion detection tools.
By Bruce Byfield

Photo by Efe Yağız Soysal on Unsplash

An application with a long history, Tiger [1] was first developed to help secure Unix systems on the Texas A&M University campus. It was released in 1994, around the same time that many other well-known classic security tools appeared, such as COPS, SATAN, and John the Ripper. Since then, the project has forked and ceased development, only to be revived in recent years as a convenient framework for modern security requirements on Unix-like operating systems.

Summarizing Tiger is a challenge. Basically, Tiger is a collection of Bourne shell scripts, C code, and data files. The Debian version includes 43 modules, seeming to cover every aspect of a Linux system imaginable, with the exception of kernels. From networks, Apache, and printers in external connections to boot managers, logs, configuration files, passwords, accounts, and groups in the system structure, Tiger analyzes them all in a variety of ways. Even missing patches, dormant users, and expired passwords are included. In all these areas, Tiger checks for configurations, duplications, inconsistencies, incorrect or vulnerable configurations, and unapplied patches, as well as security intrusions. Often, it draws on other security applications installed as dependencies. To give a full list of Tiger’s modules here is impractical, but its man page [2] provides a complete list, along with brief explanations of each. Given Tiger’s modular structure, it is possible still more will be added as computers evolve. For instance, new modules for AI seem likely in the future.

Tiger was originally written for Unix and then for Debian and Red Hat Linux. You get glimpses of the code’s age sometimes in such references as the name lilo.check, the module for all bootloader scripts named for the dominant bootloader around the turn of the century. However, today, Tiger is available in many other distributions. For greater security, though, you may prefer to download the latest release from the project’s website.

Whatever the source, Tiger’s installation includes the setup of mail (Figure 1), as well as the creation of encryption keys for
login shell. Similarly, if Tiger reports that “Use of cron is not restricted,” a sys admin of a large network might want to restrict it to ensure users are not scheduling events, while a home user who is the only one using a computer would probably not be concerned. Your responses will be contextual, although having some understanding of basic security concerns should help you to choose the appropriate responses.
After running Tiger a few times, you may want to edit /etc/tiger/tiger.ignore so that you no longer see warnings you know are harmless. The Debian version installs with a few entries already in /etc/tiger/tiger.ignore, based mainly on configurations peculiar to Debian (Figure 7). Once you have tweaked Tiger, you may want to use /etc/tiger/cronrc to select some of the suggested choices or to add your own scheduling (Figure 8).

Figure 6: Use the -e option to get log explanations for help determining how to respond to Tiger’s findings.

Challenges and Alternatives
The simple reason for Tiger’s revival is that its comprehensiveness is needed more than ever in this security-conscious era. Still, that same virtue can also be a flaw. Currently, Tiger has no option for a different selection of scripts as part of the command structure. Users who only want a specific piece of information once either have to edit the configuration file or endure a long runtime and scan through a log of irrelevant information. Alternatively, you can create multiple configuration files for commonly repeated checks, using the option -c FILE, but that is considerable work and may not be practical. Tiger is not structured with convenience as a goal.
Fortunately, for those in a hurry, Linux has no shortage of other security checkers. Tiger’s Debian page lists many dependencies that can be run independently of Tiger, such as Binutils, lsb-release, net-tools, chkrootkit, John the Ripper, Tripwire, and Lynis [3]. Tiger’s Wikipedia entry also lists complementary applications, such as intrusion detection systems, log-based intrusion detection systems, Snare for Linux, AIDE, integrit, and Samhain. However, as an all-round security audit, Tiger has few rivals.
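The idea behind an ignore file such as /etc/tiger/tiger.ignore can be illustrated with a minimal sketch. It assumes that each line of the ignore file acts as a pattern matched against report lines. The report lines, patterns, and file names below are made-up samples, not real Tiger output, and the filtering is done here with grep rather than by Tiger itself.

```bash
# Sketch: hide known-harmless findings from a saved report.
# All report lines and patterns are hypothetical samples.
workdir=$(mktemp -d)

cat > "$workdir/report.txt" <<'EOF'
--WARN-- Use of cron is not restricted
--WARN-- Sample finding about a dormant account
--FAIL-- Sample finding about an unapplied patch
EOF

# One extended regular expression per line, in the spirit of tiger.ignore.
cat > "$workdir/ignore.txt" <<'EOF'
Use of cron is not restricted
dormant account
EOF

filter_report() {
  # $1: pattern file, $2: report file
  # -v drops matching lines, -E enables extended regular expressions,
  # -f reads the patterns from a file.
  grep -E -v -f "$1" "$2"
}

filtered=$(filter_report "$workdir/ignore.txt" "$workdir/report.txt")
printf '%s\n' "$filtered"
rm -rf "$workdir"
```

Only findings that match none of the patterns remain, which is the effect you want from a well-maintained ignore list: harmless noise disappears, while anything new still shows up.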
Figure 8: /etc/tiger/cronrc is used to schedule Tiger to run regularly. Common alternatives are suggested.

Author
Bruce Byfield is a computer journalist and a freelance writer and editor specializing in free and open source software. In addition to his writing projects, he also teaches live and e-learning courses. In his spare time, Bruce writes about Northwest Coast art (http://brucebyfield.wordpress.com). He is also cofounder of Prentice Pieces, a blog about writing and fantasy at https://prenticepieces.com/.
Smart Shell
The PaLMShell.bash script lets you connect to the Google Pathways API Large
Language Model (PaLM) from the command line. By William F. Gilreath
The PaLM shell is a simple Bash script that interfaces with the Google Pathways large language model (LLM) API [1] through a RESTful interface. This method for accessing Google Pathways is simpler than a more complex implementation in a high-level compiled language. You only need a shell script – no other libraries, packages, or configuration files.

Large Language Models
Large language models, or LLMs [2], have recently become very popular. An LLM is a neural network that has been trained on massive amounts of data, using massive amounts of computing power, to create a model [3] with potentially billions of parameters.
The most well-known and popular LLM is ChatGPT or Chat Generative Pre-trained Transformer from OpenAI,

which was created by over 1,000 researchers in artificial intelligence for large-scale public access.
For the shell-based LLM client described in this article, I will use the Pathways Language Model (PaLM) from Google. PaLM was introduced in April 2022 [8] and has 540 billion parameters. This LLM is now part of Google Gemini, which is part of the Google AI Studio.
PaLM is a single LLM that can generalize across domains and tasks while being highly efficient. In May 2023, Google announced PaLM 2, the next

LLM. Google PaLM is available through Google’s PaLM API, a RESTful interface to the online LLM service.
Several programming languages support access to the PaLM API, including Java, Python, Go, C#, and Swift. I downloaded the Java version of the client and libraries, and after some tinkering and effort, I got the Java code to run and work with Google PaLM. But the effort to get the Java source code to work as a client to the PaLM RESTful API was a challenge. I decided to roll my own Bash-based PaLM client that would be accessible from the command line.

Prompt Engineering
With the rise of large language models, it has become clear that the quality of the AI response often depends on the quality of the question. The concept of creating the right prompt to get the answer you want from an LLM is known as prompt

An early example of bad prompt engineering was when King Croesus of Lydia asked the Oracle of Delphi whether he should go to war against Persia [5]. The oracle responded that if Croesus attacked Persia he would “destroy a great
Lead Image © lightwise, 123RF.com
Access by API Key
One thing you’ll need to work with the Google PaLM RESTful API is an API key. You can request the API key online. I requested a key and it took approximately three days to receive it.

Bash Script Goals
The goals for a Bash script that functions as a PaLM shell are simple:
1. Connect to the Google PaLM RESTful API successfully via HTTP.
2. Operate simply, with only one shell command to exit the shell.
3. Automatically log and record the session as a shell transcript to file.
Several external utilities were used to construct the PaLM shell as a Bash script, including:
• cURL – connects to the Google PaLM RESTful API via an HTTP URL.
• jq – handles the JSON used in the payload or body of the message.
• sed – the classic stream editor.
• tee – logs output from the Bash script to an external file with an automatic name.
Two other utilities were date and tr, which were used to create the external file name used with the tee utility.
The PaLM shell has three phases:
1. Initialize
2. Infinite command loop, REPL (read, evaluate, print, loop) [9]
3. De-initialize
Initialization simply sets up the PaLM API key and reports the version, date, and license. The script checks that the PaLM API key is not empty and then creates the external log file name.
The main loop is an infinite loop that continues to read, echo prompts, print to the user, and tee to an external file. This process continues until the user enters the one PaLM command, bye, which causes the infinite loop to terminate.
The last phase of the Bash shell script is to finalize and de-initialize, which simply prints an exit message and then exits cleanly from the Bash shell script to the operating system.

Implementation
The Bash shell script that implements the PaLM shell as PaLMShell.bash is a script of approximately 100 lines of code in length. This short length demonstrates the simplicity of the PaLM shell in comparison to the compiled high-level language examples available.
Ignoring line comments, the length of the PaLM shell script is nearly halved to 40 lines of code. Thus half the Bash script source code is comments to document and explain the PaLM shell Bash script.

Initialize
The initialize section initializes, or sets up, the Bash script (Listing 1). The header with information about the Bash script is reported, including the release, version, and copyright.
The script then checks that the variable PALM_KEY, which is the RESTful API key provided by Google to access the Pathways LLM, exists and is set. If not, the script exits, as it cannot access the RESTful API.
Then the date and time are used to create the transcript or log filename used by the script; the filename generated is then reported to the user.

Main Body Infinite Loop
The main body of the Bash script for the PaLM shell is an infinite loop (Listing 2). The infinite loop is the classic software REPL. A while loop continues the process of evaluating prompts (see the “Prompt Engineering” box for more

Listing 1: Initialize
    PALM_KEY="NHGUBE-PBATENGF-BA-XABJVAT-EBG13-PBQVAT"

    #
    #
    #
    echo "PaLM Shell v1.1 (c) 2023 William F. Gilreath"
    echo "All Rights Reserved. License is GPLv3 "
    echo

    #
    # check if PALM_KEY empty, report error need key, exit
    if [ -z "$PALM_KEY" ]
    then
      echo "Must have PaLM key to use PaLM API!"
      exit 1
    fi

    # timestamp for the log file name, with spaces and colons replaced by dashes
    date_time=$(date | tr ' ' '-' | tr ':' '-')
    # [...]
    echo

Listing 2: Main Body
    while :
    do
      echo -n "PaLM>"
      read -r prompt_line

      #'bye' is bye/exit/quit PaLM Shell
      if [ "$prompt_line" == "bye" ]; then
        break
      fi

      echo "Prompt: '$prompt_line'" | tee -a "$log_file"

      echo | tee -a "$log_file" #PaLM-$date_time.log

      curl \
        -s \
        -H 'Content-Type: application/json' \
        -d '{ "prompt": { "text": '"\"$prompt_line\""' }}' \
        "https://generativelanguage.googleapis.com/v1beta2/models/text-bison-001:
      # [...]
      output' | sed 's/\\n/\n/g'

    done
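The helper steps around the API call can be sketched in isolation. The following is a minimal sketch and not the author's script: the function names are hypothetical, the PaLM-<date>.log name pattern is an assumption taken from the comment in the main loop, and the escaping step is an addition that the listing does not contain. No network request is made.

```bash
# Build a log file name from the current date; spaces and colons are
# replaced by dashes, as in the date | tr pipeline of the listing.
date_time=$(date | tr ' ' '-' | tr ':' '-')
log_file="PaLM-$date_time.log"   # assumed name pattern

# Escape backslashes and double quotes so that a prompt can sit inside a
# JSON string. Control characters such as tabs are not handled here.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Assemble a request body in the shape used by the listing.
make_payload() {
  printf '{ "prompt": { "text": "%s" } }' "$(json_escape "$1")"
}

# Turn literal \n sequences in a reply into real line breaks (GNU sed).
unescape_newlines() {
  sed 's/\\n/\n/g'
}

echo "Log file: $log_file"
make_payload 'Say "hello" in French'
echo
printf '%s\n' 'Bonjour\nSalut' | unescape_newlines
```

Escaping matters because a prompt that contains a double quote would otherwise end the JSON string early and make the request body invalid. Where jq is installed anyway, it could build the payload as well.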
Pick Up
In this month’s column, Mike Schilli writes a special
mail client in Go and delves into the depths of the
IMAP protocol in order to archive photos from
incoming emails. By Mike Schilli
Popular email clients such as Thunderbird or Microsoft Outlook make it very easy to automatically filter and forward incoming messages. I recently had the idea of sending freshly taken photos to my account by email, from where my desktop computer would automatically retrieve them at regular intervals, extract the photos from the email body, and proceed to archive them (Figure 1). How difficult would it be to fetch emails from the provider with a DIY Go program and extract the photos embedded in MIME format in order to store them on my hard disk? Luckily, Go libraries make quick work of such problems!

Transport Logistics
In order for turnkey email clients to gain access to your choice of provider’s mail server, they require three parameters: the IMAP server including the port, the username, and the password. Besides this, you are always asked for the SMTP server and port, possibly along with any other credentials you might have there. The reason for this is that two completely different technologies are used to collect and send email, and they usually run on different servers.
Protocols such as IMAP (or maybe POP3) check whether a user on the mail server has mail. To do this, the email client contacts the server and asks how many messages there are. The client can then either download the messages individually and store them in folders or even delete them.
My provider for personal use is Fastmail, in my opinion a solid contender in the rough-and-ready email trade. An account on Fastmail.com, whose incoming mail can be checked in the browser with a zippy web interface (Figure 2), costs $30 a year. For IMAP-based access, it’s $60 per year.

IMAP by Hand
It is not difficult to manually enter the IMAP commands required to retrieve an email in an interactive session with the server. As communication with the IMAP server is almost always SSL-encrypted nowadays, the terminal program used in Figure 3 is not Netcat (nc) or even the venerable Telnet, but OpenSSL with the s_client command. OpenSSL expects the user’s input line by line, encrypts it, and sends it over to the server. The client then decrypts the responses and also displays them line by line – Telnet with encryption, if you like.
The user first logs in with the LOGIN command and provides their username and password. These access credentials help the provider to grant only authorized (and paying) customers access to their email. To then see whether new mail has been received, SELECT INBOX selects the folder of the same name, and SEARCH ALL retrieves the numerical IDs of all the messages in the folder. The example in Figure 3 returns a long list of 88 email messages with consecutive numbers as IDs from 1 to 88. The client can then use FETCH to download one or more of these messages and either display or archive them or perform some other magic.
The client can

Author

Lead Image © Tatiana Venkova, 123RF.com
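The manual session described under “IMAP by Hand” boils down to a handful of tagged commands. As a minimal sketch, the hypothetical helper below only prints that command sequence and opens no connection. The user name, password, and message number are placeholders, and the tags a1 to a5 are arbitrary labels that IMAP requires in front of every command.

```bash
# Print the IMAP commands for a login, a search, and a fetch.
# Each command carries a unique tag so that replies can be matched to it,
# and each line ends in CRLF, as the protocol expects.
imap_commands() {
  # $1: user name, $2: password, $3: number of the message to fetch
  printf 'a1 LOGIN %s %s\r\n' "$1" "$2"
  printf 'a2 SELECT INBOX\r\n'
  printf 'a3 SEARCH ALL\r\n'
  printf 'a4 FETCH %s BODY[]\r\n' "$3"
  printf 'a5 LOGOUT\r\n'
}

imap_commands 'user@example.com' 'secret' 1
```

In an interactive session, you would type these lines one by one into something like openssl s_client -connect imap.example.com:993 -crlf -quiet, where the host name is only a placeholder. User names or passwords that contain spaces or special characters would need IMAP quoting, which this sketch leaves out.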
            return err
        }

        c.Log.Debug("Connect OK")
        c.Log.Debugw("Login", "user", c.User)
        if err := c.Cli.Login(c.User, c.Pass).Wait(); err != nil {
            return err
        }

        c.Log.Debug("Login OK")
        return nil
    }

In the Beginning
When email was invented in the 1970s, nobody thought of using it to exchange photos – it was just text that was sent across the wire. As the transport method has not changed to this day, email clients are still forced to encode media data as text in MIME format 50 years later.
Figure 2 shows an email with an attached photo in the Fastmail web client. The downloaded raw text of the same email can be found in Figure 4. The Content-Type header is set to multipart/mixed to indicate that the email body contains different types of media. The columns of numbers in the boundary attribute define the line boundaries at which the encoding of the respective parts begins. The JPEG data of the attached photo can be found below in text-friendly Base64 format.
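How a multipart/mixed message carries an attachment can be tried out on a tiny, made-up example. The sketch below is not part of the Go program: the message, the boundary string 12345, and the file name are hypothetical, and a short text file stands in for the JPEG so that the decoded result is readable. It assumes an awk and a base64 tool with a -d option, as found on typical Linux systems.

```bash
# A minimal hand-written multipart/mixed message (hypothetical content).
raw_email=$(cat <<'EOF'
From: me@example.com
To: me@example.com
Subject: Photo
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="12345"

--12345
Content-Type: text/plain

See attachment.
--12345
Content-Type: text/plain; name="note.txt"
Content-Disposition: attachment; filename="note.txt"
Content-Transfer-Encoding: base64

SGVsbG8sIE1JTUUh
--12345--
EOF
)

# Print the body lines of every part that declares base64 encoding.
extract_base64_part() {
  # $1: boundary string without the leading "--"
  awk -v b="--$1" '
    # A delimiter line starts a new part (the closing one ends the last).
    index($0, b) == 1 { in_part = 1; in_body = 0; is_b64 = 0; next }
    # Inside the part headers, look for the base64 transfer encoding.
    in_part && !in_body && tolower($0) ~ /^content-transfer-encoding: *base64/ { is_b64 = 1 }
    # The first blank line separates the part headers from the part body.
    in_part && !in_body && $0 == "" { in_body = 1; next }
    in_body && is_b64 { print }
  '
}

printf '%s\n' "$raw_email" | extract_base64_part 12345 | base64 -d
echo
```

A real mail parser also has to deal with nested multiparts, folded headers, and other encodings, which is exactly the work that a ready-made library takes off your hands.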
Listing 2: imap-fetch.go
    package main

    import (
        "github.com/DusanKasan/parsemail"
        "github.com/emersion/go-imap/v2"
        "io/ioutil"
        "regexp"
        "strings"
    )

    func (c *conn) UnreadEmails() (*imap.SeqSet, error) {
        ids := new(imap.SeqSet)

        // read/write!
        mbox, err := c.Cli.Select("INBOX", &imap.SelectOptions{ReadOnly: false}).Wait()
        if err != nil {
            return ids, err
        }
        c.Log.Debug("Select ok")
        c.Log.Debugw("Inbox", "messages", mbox.NumMessages)
        if mbox.NumMessages == 0 {
            c.Log.Debug("No message in mailbox")
            return ids, nil
        }

        searchCriteria := &imap.SearchCriteria{Not: []imap.SearchCriteria{{
        // [...]
        }

        c.Log.Debugw("Fetched ", "msgs", len(messages))
        for _, msg := range messages {
            rawEmail := ""
            for _, buf := range msg.BodySection {
                rawEmail += string(buf)
            }
            msgs = append(msgs, rawEmail)
        }

        return msgs, nil
    }

    func (c *conn) ProcessEmail(rawEmail string) error {
        email, err := parsemail.Parse(strings.NewReader(rawEmail))
        if err != nil {
            return err
        }

        c.Log.Debugw("Fetched email",
            "subject", email.Subject,
            "size", len(email.HTMLBody),
            "attms", len(email.Attachments),
        )

        for _, a := range email.Attachments {
            data, err := ioutil.ReadAll(a.Data)
            if err != nil {
                return err
            }
            c.Log.Debugw("Attachment",
            // [...]
            }
        }

        return nil
    }
Listing 3: store.go
    package main

    import (
        "errors"
        "fmt"
        "os"
        "path/filepath"
        "strings"
    )

    func (c *conn) toStore(fpath string, data []byte) error {
        var err error
        home, err := os.UserHomeDir()
        if err != nil {
            return err
        }

        photoDir := filepath.Join(home, "photos")
        os.Mkdir(photoDir, 0755)
        base := filepath.Base(fpath)
        npath := filepath.Join(photoDir, base)

        var f *os.File
        // The Stat error gets its own name so that the outer err, which is
        // checked below, receives the result of Create or CreateTemp.
        if _, statErr := os.Stat(npath); errors.Is(statErr, os.ErrNotExist) {
            f, err = os.Create(npath)
        } else {
            suffix := filepath.Ext(base)
            prefix := strings.TrimSuffix(base, suffix)
            f, err = os.CreateTemp(photoDir, fmt.Sprintf("%s-*%s", prefix, suffix))
        }

        if err != nil {
            return err
        }

        c.Log.Debugw("Write", "name", f.Name(), "size", len(data))
        _, err = f.Write(data)
        if err != nil {
            return err
        }
        err = f.Close()
        if err != nil {
            return err
        }

        return nil
    }
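The naming rule in Listing 3, never overwriting a photo that already exists, can be tried out on the command line. The following is a minimal sketch with a hypothetical store_unique helper. Unlike os.CreateTemp, which inserts a random string, it appends a running counter, and it makes no attempt to be safe against two processes writing at the same time.

```bash
# Copy a file into a directory without overwriting an existing file.
store_unique() (
  # $1: source file, $2: target directory
  src=$1
  dir=$2
  base=${src##*/}
  case $base in
    ?*.*) stem=${base%.*}; ext=.${base##*.} ;;   # name with an extension
    *)    stem=$base; ext= ;;                    # no extension
  esac
  dest=$dir/$base
  n=1
  # Try photo.jpg, then photo-1.jpg, photo-2.jpg, ... until a name is free.
  while [ -e "$dest" ]; do
    dest=$dir/$stem-$n$ext
    n=$((n + 1))
  done
  cp "$src" "$dest" && printf '%s\n' "$dest"
)

# Demo in a throwaway directory
workdir=$(mktemp -d)
mkdir "$workdir/photos"
printf 'fake image data' > "$workdir/photo.jpg"
first=$(store_unique "$workdir/photo.jpg" "$workdir/photos")
second=$(store_unique "$workdir/photo.jpg" "$workdir/photos")
printf '%s\n%s\n' "${first##*/}" "${second##*/}"
rm -rf "$workdir"
```

A counter gives predictable names, while the random suffix used in the Go listing avoids scanning the directory when many files share the same name.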
Listing 4: phimap.go
    package main

    func main() {
        c := NewIMAP()
        err := c.Open()
        if err != nil {
            c.Log.Fatalw("conn", err)
        }
        defer c.Close()

        // [...]

        emails, err := c.FetchEmails(ids)
        if err != nil {
            c.Log.Fatalw("Fetch", err)
        }

        for _, email := range emails {
            err := c.ProcessEmail(email)
            if err != nil {
        // [...]
    }

It’s a Trap!

Figure 6: A test run of the photo archiver with the verbose setting, which includes debug information.
MakerSpace
Learn the Julia programming
language on a Raspberry Pi
Cooking Pi
Create GUIs and a web app that connects to sensors.
By Pete Metcalfe
Julia [1] is a relatively new general-purpose programming language that has an emphasis on data science and artificial intelligence (AI). In the past few years, Julia has been gaining momentum. Like Python and R, Julia can be used in Jupyter

interface (GUI), micro-web server, and

instructions for your specific system online [2]. The latest supported version of Julia can be installed on a Raspberry Pi with the Snap package manager. To install Snap, enter

    sudo apt update
    sudo snap install julia --classic

    $ julia --eval 'using Pkg; Pkg.add("PiGPIO")'

    julia> Pkg.add("PiGPIO")
    julia> # view the status of installed packages
    julia> Pkg.status()
    Status `~/.julia/environments/v1.9/Project.toml`
    [bb151fc1] PiGPIO v0.2.0

Figure 2: Julia and Raspberry Pi test setup.

pigpiod daemon is created, to which the Julia script connects. The pigpiod daemon needs to be started before Julia can connect to the Raspberry Pi pins:

To load the PiGPIO library, create the Pi object (p), set the mode of the pin, and finally write to and read from GPIO pin 17, use the interactive code

    julia> using PiGPIO;
    julia> set_mode(p,ledpin, PiGPIO.OUTPUT);
two-button window that will turn on or turn off a GPIO pin. The Julia syntax looks somewhat like Python or Matlab. Libraries are imported by the using command (lines 4-5). Functions are defined with a function statement and closed with an end (lines 32-35 and

Figure 3 shows the Julia GTK GUI window toggling GPIO pin 17 (the yellow LED 2 on the Pi Explorer HAT).

Figure 3: Julia GTK GUI app toggling a Pi GPIO pin.

Julia Web Server
As with graphics libraries, you have a few web server options. I found that the Genie web framework [3] was one of the easiest to get up and running. Listing 2 is an example web app that creates two buttons to turn a GPIO pin on and off (Figure 4).
This script uses three libraries: Genie for the web server framework, Sockets to get the Raspberry Pi’s IP address, and

    02 # web2gpio.jl - Julia Web Page with routes to turn on/off a Pi GPIO pin
    03 #
    04 using Genie
    05 using Sockets # to get IP address
    06 using PiGPIO;
    07
    08 # Setup GPIO
    09 p=Pi();
    10 led_pin = 17;
    11 set_mode(p, led_pin, PiGPIO.OUTPUT)
    12
    13 # get the Rasp Pi IP address
    14 myip = string(getipaddr() )
    15
    16 println("Loading Genie...")
    17
    18 # create an HTML document
    19 html_file = """
    20 <html>
    21 <head>
    22 <title>Julia GPIO Test Page</title>
    23 </head>
    24 <body>
    25 <h1>Toggle Pi LED</h1>
    26 <form>
    28 <button type="submit" formaction="/OFF">OFF</button>
    29 </form>
    30 </body>
    31 </html>
    32 """
    33
    34 route("/") do
    35   println("Connected")
    36   return html_file
    37 end
    38
    39 route("/ON?") do
    40   PiGPIO.write(p, led_pin, PiGPIO.HIGH)
    41   println("ON Selected")
    42   return html_file
    43 end
    44
    45 route("/OFF?") do
    46   PiGPIO.write(p, led_pin, PiGPIO.LOW)
    47   println("OFF Selected")
    48   return html_file
    49 end
    50
    51 # Start the web app on port 8001
    52 up( 8001, myip, async = false)
19-32). For larger projects I would recommend the use of external HTML files.

(lines 34-49). The first route is the default (/) page that prints out the html_file variable. The /ON and /OFF routes set the GPIO pin output when either the ON or OFF form buttons are pressed.
The final line in the script sets up the web framework to run on port 8001 with the Raspberry Pi’s IP address.
The command

    julia web2gpio.jl

starts the Julia web script.

Plot BME280 Sensor Data
For the final example, the goal is to create a dynamic bar chart that shows BME280 temperature and humidity data. The Plots charting library is installed by

    julia> Pkg.add("Plots")

sor libraries, so for this project a BME280 Bash command-line utility is used:

    $ # Install a Python based Bash command line tool
    $ sudo python -m pip install bme280

BME280 sensors typically use the I2C address 0x76 or 0x77. The i2cdetect tool can be used to identify your sensor address by entering

    julia> run(mycmd) ; # run the command
    /home/pete/

The Julia read function runs an external command and stores the output string as a variable (Figure 5). Next, a split creates an array of string output from the BME280 data. For this example, the humidity value is the third index (data[3]), and temperature is the fifth index (data[5]).

    $ # enable the execute status on the file
    $ # run the script directly
    $ ./bme_280.jl

Listing 3: Plot BME280 Sensor Data
    #!/snap/bin/julia
    #
    # bme_280.jl - show BME280 sensor data on a refreshing bar chart
    #
    using Plots
    using Dates

    print("Show BME280 readings every 10 seconds. Control-C to exit.\n")
    while true

      # get the sensor output, then split it into an array of strings
      output = read(`read_bme280 --i2c-address 0x77`, String)
      # [...]

      # get the temperature as a Float, (5th item in array)
      # [...]

      # get the humidity as a Float, (3rd item in array)
      h = parse(Float64,data[3])

      # define a bar chart
      thetitle = "BME280 Data - " * Dates.format(now(), "H:MM:SS")
      p = bar(["Temperature (C)","Humidity (%)"], [t,h],legend=false,
              title=thetitle, texts = [t,h], color=[:red, :yellow] )
      display(p)

      sleep(10)
    end

Figure 6 shows the BME280 bar chart app with the Raspberry Pi hardware. A nice enhancement to this project would be to save the sensor data to a database or file and then present the results in a moving real-time line chart.

Julia Calling Python
One of the interesting features in Julia is that script languages such as Lua, Matlab, R, and Python can be integrated directly into Julia code.
To enable this functionality, the first step is to install the required language library. To add the Python interface library, enter

    $ julia --eval 'using Pkg; Pkg.add("PyCall")'

Most of the Raspberry Pi sensors and third-party devices are supported in Python, so the PyCall library can come in extremely handy when you want to communicate with any external Pi hardware.
A simple example uses PyCall with a 16x2 LCD display. The first step in the integration is to install Python’s Pi LCD library:

    $ # install the Python Raspberry Pi LCD library
    $ pip install rpi-lcd

PyCall works in a couple of different ways. The first is Python code used directly within a Julia script, and the second is to have Julia code reference Python objects. An example runs Python code directly to write text to a two-line I2C LCD screen:

    julia> py"""
    lcd = LCD()
    lcd.text('Hello World!', 1)
    lcd.text('Raspberry Pi', 2)
    """

    using PyCall
    lcd = rpi_lcd.LCD();
    lcd.text("Hello from Julia", 1)
    lcd.text("Raspberry Pi!", 2)

is achieved by referencing a Python object in Julia (Figure 7).

Summary
A Raspberry Pi can be a nice platform to use to get to know Julia programming. For Python users, the mixing of Python code into Julia scripts allows common Pi hardware to be integrated into your projects.
After getting a handle on some of the Julia basics, the next steps might be to investigate data analysis, Jupyter notebooks, or AI applications.
It should be noted that Julia can be painfully slow on older Raspberry Pi hardware or with early versions of Julia. If speed becomes an issue, Julia offers a number of packaging and pre-compilation options to improve call-up time.
Figure 7: Using Python libraries and objects in Julia.

Info
[1] Julia programming language: https://julialang.org/
[2] Julia downloads: https://julialang.org/downloads/
[3] Genie web framework: https://genieframework.github.io/Genie.jl/dev/

Author
You can investigate more neat projects by Pete Metcalfe and his daughters at https://funprojects.blog.
MakerSpace
Use your Raspberry Pi as a
home theater server.
And … Action!
The top dogs in the media server space now face some
competition from Jellyfin, a relatively young project that
impresses with a number of innovations. By Erik Bärwaldt
Kodi [1] and Plex [2] are well known, mainly because of their availability on multiple platforms. Although you can easily run Kodi on virtually any Linux derivative, Plex also supports many network-attached storage (NAS) systems, which you can use as media servers while providing huge storage capacity for multimedia content. Jellyfin [3], which was first released in 2018 and is derived from the Emby media server, also focuses on Linux as a platform but supports the Flatpak package management system and Docker container environments, as well. The newcomer is also suitable for use on the Raspberry Pi, meaning you can use the small computer as an energy- and space-saving media server for your home theater.
Like the Plex Media Server, Jellyfin is a client-server application. Client applications that retrieve and play content from the server are available for all popular desktop operating systems, but also for mobile devices such as

In contrast to Plex, Jellyfin works without forcing you to have a connection to a cloud server. The Jellyfin media server can provide content without an Internet connection, which means you can use offline and online sources at the same time. Another Jellyfin advantage is the license model. The Plex Media Server vendor sells Plex under what is known as a freemium model that only lets you access the basic functions free of charge. Extended features are only available if you have a commercial Plex Pass, and some apps require users to pay an activation fee. Moreover, US-based Plex reserves the right to collect customer data and share it with partners. The aim is to serve up advertising in a more targeted manner. Jellyfin, on the other hand, is completely free and does not market any user data.

Downer
Jellyfin can transcode video files to make them available if codecs are not available. Various types of hardware acceleration are used, with a modified version
Lead Image © andre266, 123RF.com
Installation
To begin, install the latest 64-bit version of Raspberry Pi OS; don’t go for the most extensive image with a complete suite of application programs – one of the smaller images will be fine. Your system’s microSD card needs to support the fastest possible data transfer standard to avoid becoming a bottleneck and leading to unnecessary latency when transferring large volumes of data.
After you have installed and updated the operating system, pop up a terminal after a reboot and integrate the Jellyfin server into the system. The developers not only provide extensive documentation for several Linux distributions but also offer a ready-made script for Debian and Ubuntu users that integrates the Jellyfin repository into your system and carries out the installation [5]:

    $ curl https://repo.jellyfin.org/install-debuntu.sh | sudo bash

The script then completes all the necessary steps and launches the Jellyfin server.
When done, use a conventional file manager or the command line to create a directory structure for the content to be backed up locally. Create separate directories and subdirectories for videos, audio files, trailers, and so on. Please bear in mind that Jellyfin distinguishes between films and series and therefore may require several directories. To ensure that the directories can all be addressed later in the dialog to be used for creating libraries, define file access control lists by entering the command:

    $ sudo setfacl -m u:jellyfin:rx /<path/to/files>/

Figure 1: A graphical wizard simplifies the basic configuration of Jellyfin.

your intranet by typing http://<RaspPi IP>:8096 in a web browser. In the next step, you are taken to a configuration routine with localization options, if you need them. Next, set up a user with authentication credentials (Figure 1).
Now, create an initial media library by clicking the Add Media Library button. In the dialog, specify the type of multimedia content, how the software should display the library, and the path to the library. You can also define the preferred language in which you want Jellyfin to load the content.
In the further course of the Settings dialog, you can use predefined settings to specify the source from which you will be fetching metadata for the multimedia content, such as information on tracks or photos of the artists. All you have to do is check the boxes to target the source servers. Once you have made all the settings, press OK to confirm. The software now switches back to the overview and displays the newly created library.
Clicking Next takes you to a dialog in which you can specify the preferred language for the metadata to be loaded. As in the previous dialog, simply use the selection boxes. In the final step, you can allow direct access to the Jellyfin server from the Internet from other computers or mobile systems. Click Finish to complete the basic configuration; the routine now loads the metadata of the individual content for the newly created library off the web.
A window opens where you can log in to the server. After entering your credentials, you are taken to your individual user interface. When you get there, you will see the libraries you created lined up
horizontally at the top, and the individual content of the libraries appears below (Figure 2).
Operations
Scrolling to the left or right with the arrow keys above each library lets you browse the folders. As soon as you mouse over the content images, a play button appears in the middle to let you play the content. If necessary, the application transcodes the existing data on the Raspberry Pi and sends it to the web client. The integrated playback software then plays the content in the client computer’s web browser.
For audio files, only the playback software controls appear in the lower part of the web browser. If you click on an audio collection in the library view, Jellyfin hides the other libraries, and you can view the details of the selected collection. In this view, you can also start playback by clicking on a track or saved album. At top center in the browser window you will then find a control bar that you can use to compile a playlist from individual tracks, display information about the individual artists, or call up a song list.
The left arrow at top left in the browser window switches back to the main window. If you previously enabled audio file playback, you will continue to hear the playback on the client computer. Audio playback only ends if you switch to a video file or cancel manually.
In the video file overview, you will find an options bar at the top that can be used to display the movie genres, create trailers or collection lists, or view individual movies marked as favorites in an overview. To add a movie to this list as a favorite, click on the small heart at the bottom of the preview image. In the preview of the individual libraries, you can also click on the Favorites link in the options bar at the top to call up your favorites right away.
If you want to see more detailed information on individual content, click on the preview of an audio or video file. A neat window then appears showing the preview image on the left and detailed information about the movie or audio file on the right (Figure 3). A small options bar at top right lets you start playback, add content to your favorites list, or – in the case of films – view a trailer. For audio files, you’ll see a track list, which you can use to play individual audio tracks from an album.
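The directory preparation from the installation steps (separate folders per media type, then read access for the jellyfin account) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the folder names in MEDIA_DIRS and the helper functions are hypothetical, and the setfacl call simply repeats the flags shown in the article for each directory. By default the sketch only prints the commands instead of running them.

```python
import subprocess
import tempfile
from pathlib import Path

# Hypothetical layout: these folder names are illustrative, not prescribed by Jellyfin.
MEDIA_DIRS = ["movies", "shows", "music", "trailers"]


def create_media_tree(root: Path) -> list[Path]:
    """Create one subdirectory per media type below root and return them."""
    created = []
    for name in MEDIA_DIRS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


def setfacl_command(path: Path, user: str = "jellyfin") -> list[str]:
    """Build the setfacl call from the article for a single directory."""
    return ["sudo", "setfacl", "-m", f"u:{user}:rx", str(path)]


def grant_access(paths: list[Path], dry_run: bool = True) -> None:
    """Print each command; only execute it when dry_run is False."""
    for path in paths:
        cmd = setfacl_command(path)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Demonstrate on a throwaway directory so nothing on the system changes.
    with tempfile.TemporaryDirectory() as tmp:
        grant_access(create_media_tree(Path(tmp)))
```

Keeping films and series in separate folders, as in the hypothetical list above, matches the article's note that Jellyfin treats them as different library types.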
Settings
Each user has a settings menu to help
them customize the interface. To access
the menu integrated in the control bar,
use the hamburger menu top left in the
browser window. You will now see the
control and admin menu. The top cate-
gory, Media, shows the libraries avail-
able in your installation, and you can
call up the admin menu lower down.
Global system settings are on the Dashboard page. The software lists the various configuration options for the server at top left. Lower down, you can configure various settings for external devices, but also for troubleshooting. In the larger window segment on the right, the dashboard displays the Jellyfin server’s technical data by default, including a list of activities on the server, as well as all actions individual users have performed (Figure 4).
Figure 3: The metadata gives users useful information about the content.
You can create new users and define
their rights in the Users dialog (e.g.,
share only specific libraries with new
users), and you can add extensions to
your installation in the Plugins menu in
the Advanced section. This option lets you integrate more services (e.g., additional metadata sources). Jellyfin comes with a number of plugins already integrated. From the options bar above the plugin display you can show additional plugins and integrate repositories into your installation. Although Jellyfin has nowhere near as many add-on modules as other media players, you can find useful extensions for some important features, such as implementing other metadata providers (Figure 5).
The modifications made in the Dashboard dialog have a global effect on the system (i.e., they apply to all users); options for individual logged-in users are in the Settings dialog, including options for modifying the look of the GUI, changing the user profile, and setting configuration options for content playback (Figure 6).
Figure 4: The dashboard provides a global overview of the system and lets you modify the configuration.
Figure 5: Extend Jellyfin’s feature set with add-on modules.
Permanent
By default, you need to restart the Jellyfin server every time you boot your [...] at the prompt.
Conclusions
The Jellyfin media server lacks nothing compared with other established solutions. Thanks to the server, you can manage and play back your media collection on supported clients across multiple platforms and in the browser. A wide variety of sources can be used. The ability to integrate services that provide metadata gives you additional interesting facts related to your media. Jellyfin is particularly respectful of your privacy and is easy on your wallet: The software is not only free of charge, it avoids collecting personal data. These features are especially attractive for anyone interested in serving up audio and video media.
Info
[1] Kodi: https://kodi.tv
[2] Plex: https://www.plex.tv
[3] Jellyfin: https://jellyfin.org
[4] Hardware accelerators: https://jellyfin.org/docs/general/administration/hardware-acceleration/
[5] Installing Jellyfin: https://jellyfin.org/docs/general/installation/linux
Author
Erik Bärwaldt is a self-employed IT admin and technical author living in the United Kingdom. He writes for several IT magazines.
Figure 6: Users can customize the GUI in Jellyfin.
MADDOG’S DOGHOUSE
Non-teachers often underestimate the work of teachers; the profession merits greater support and funding. BY JON “MADDOG” HALL
Jon “maddog” Hall is an author, educator, computer scientist, and free software pioneer who has been a passionate advocate for Linux since 1994 when he first met Linus Torvalds and facilitated the port of Linux to a 64-bit system. He serves as president of Linux Teaching International®.
Recently, I participated in a Facebook conversation with [...]
[...] Fortran (for the math/science side), and COBOL (for the business [...]
At Your Fingertips
RDAP provides structured information about domains. Besides practical
command-line query tools, there are also libraries for integrating the protocol
into your own programs.
BY FRANK HOFMANN
Since the early days of the Internet and its division into different organizational areas, users have needed a way to obtain information about domains, the IP addresses they use, their owners, and a way of mapping them to each other. The WHOIS protocol was devised for this purpose in 1982 and introduced in RFC 812 [1]. The service can be accessed using a command-line tool of the same name, whois. The age and importance of the protocol are evident from, among other things, its very low number – 43 – in the list of standardized ports for network services [2], which can be found in the file /etc/services [3].
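To make the role of port 43 more tangible, here is a minimal sketch of what a WHOIS client does on the wire: it opens a TCP connection, sends the search term followed by a line break, and reads until the server hangs up. The function names are hypothetical, and the server name in the closing comment (whois.denic.de for .de domains) is an assumption rather than something this article specifies.

```python
import socket

WHOIS_PORT = 43  # the standardized port mentioned above


def build_query(name: str) -> bytes:
    """A WHOIS request is just the search term followed by CRLF."""
    return name.strip().encode("ascii") + b"\r\n"


def whois(name: str, server: str, timeout: float = 10.0) -> str:
    """Send one query and read until the server closes the connection."""
    with socket.create_connection((server, WHOIS_PORT), timeout=timeout) as sock:
        sock.sendall(build_query(name))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example call (needs network access; the server name is an assumption):
# print(whois("linux-community.de", "whois.denic.de"))
```

The whois command-line tool additionally works out which server is responsible for a given domain; the sketch leaves that choice to the caller.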
% Restricted rights.
% The above data may only be used within the scope of technical or
% problems.
% The use for other purposes, in particular for advertising, is not permitted.
% The DENIC whois service on port 43 doesn't disclose any information concerning
% This information can be obtained through use of our web-based whois service
% http://www.denic.de/en/domains/whois-service/web-whois.html
Domain: linux-community.de
Nserver: eva.ns.cloudflare.com
Nserver: jay.ns.cloudflare.com
Status: connect
Changed: 2021-01-07T10:44:58+01:00
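Output like the excerpt above is line oriented, which invites a small parser. The following sketch is an illustration only: it assumes the DENIC conventions visible in this one response (comment lines starting with % and Key: value pairs, with keys such as Nserver allowed to repeat). Other registries format their answers differently, so the function is not a general WHOIS parser.

```python
from collections import defaultdict

# Shortened copy of the response shown above.
SAMPLE = """\
% Restricted rights.
Domain: linux-community.de
Nserver: eva.ns.cloudflare.com
Nserver: jay.ns.cloudflare.com
Status: connect
Changed: 2021-01-07T10:44:58+01:00
"""


def parse_whois(text: str) -> dict[str, list[str]]:
    """Collect 'Key: value' lines; keys may repeat, so values are lists."""
    record = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and DENIC-style comment lines
        # Split at the first colon only, so timestamps keep their colons.
        key, sep, value = line.partition(":")
        if sep:
            record[key.strip().lower()].append(value.strip())
    return dict(record)


record = parse_whois(SAMPLE)
print(record["domain"][0])
print(record["nserver"])
```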
RDAP
The idea is for RDAP to completely replace
WHOIS. It provides information about a domain
via the HTTP protocol. This eliminates the need to
open port 43 in the firewall. RDAP was standard-
ized back in 2015 by an Internet Engineering Task
Force (IETF) working group.
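Because RDAP rides on HTTP, a query needs nothing more than a URL and a JSON decoder. The sketch below uses only the Python standard library. The base URL is an assumption: rdap.org is used here as a public redirector to the responsible registry, and the function names are hypothetical. The three path segments correspond to the domain, IP address, and ASN lookups that RDAP supports.

```python
import json
import urllib.request

# Assumption: rdap.org acts as a public redirector to the responsible
# registry's RDAP server; swap in another base URL if you prefer.
RDAP_BASE = "https://rdap.org"


def rdap_url(kind: str, value: str) -> str:
    """Build a lookup URL; kind is 'domain', 'ip', or 'autnum' (ASN)."""
    if kind not in {"domain", "ip", "autnum"}:
        raise ValueError(f"unsupported RDAP object class: {kind}")
    return f"{RDAP_BASE}/{kind}/{value}"


def rdap_lookup(kind: str, value: str, timeout: float = 10.0) -> dict:
    """Fetch and decode one RDAP response (plain HTTPS plus JSON)."""
    request = urllib.request.Request(
        rdap_url(kind, value),
        headers={"Accept": "application/rdap+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (needs network access):
# reply = rdap_lookup("domain", "linux-magazine.com")
```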
In terms of its objectives, RDAP is similar to
WHOIS in that it enables searching for domain
names, IP addresses, and the Autonomous Sys-
tem Numbers (ASNs) of Internet resources. Put
simply, an ASN references a set of routers that
are part of a shared network and lets users ad-
dress them as a unit, for example, on a content
delivery network (CDN) for web and media content.
Access via the Command Line
This is where the whois command comes into play. The call includes an IP address (Figure 1) or a domain name; Listing 2 shows an excerpt of a query for linux-magazine.com. The result makes it clear that not all the information is easily accessible. The text-based WHOIS output also has a different structure depending on the domain queried. This makes parsing the output challenging and machine processing more difficult. In contrast, [...]
Figure 1: This is what the WHOIS output for an IP address looks like.
DNSSEC: unsigned
[...]
example, again, this is linux-magazine.com (line 4). In the next step, you create an RdapClient() class object (line 7), which has a method named get_domain() [...] previously defined domain as a parameter (line 9). The remaining lines are used to evaluate the response received. They deliver the domain name (line 12), the expiry date (lines 15 to 20), the registrar (lines 23 and 24), and the stored name servers (lines 27 to 30). The output can be found in Listing 6; the content matches Listing 4.

02 from rdap import RdapClient
03 # define domain name
04 domainName = "linux-magazine.com"
06 # create RdapClient
07 rdapc = RdapClient()
08 # send RDAP request
09 reply = rdapc.get_domain(domainName)
10
11 # extract domain name
12 print("Name:", reply.data["ldhName"].lower())
13
[...]
23 registrar = reply.data["entities"][0]["vcardArray"][1][1][3]
24 print("Registrar:", registrar)
25
26 # extract nameservers
27 nameservers = reply.data["nameservers"]
28 for entry in nameservers:
29     name = entry["ldhName"].lower()
30     print("Nameserver:", name)

ipwhois
[...] the colorful output of the RDAP record for the heise.de domain, which was passed in here in the form of the matching IPv6 address. ipwhois_utils.py provides many auxiliary functions, such as the ability to compute CIDR ranges and list currently unused IP addresses.
Listing 7 demonstrates how to use the ipwhois library. Lines 1 to 3 first integrate two modules, pprint for nicer output and ipwhois for accessing the WHOIS protocol. You only need the IPWhois class from ipwhois. Line 6 defines the IP address to be queried; in our case, the IPv6 address of the heise.de domain. The call in line 8 creates a suitable query object.
The actual RDAP request takes place in line [...] using the lookup_rdap() method. The depth=1 parameter limits the recursion depth for the query to the first level. The result is then output in line 12 using pprint(). The result in Listing 8 shows all publicly available information on the requested IP address. There are no limits to further utilization of the data in your own tools.

12 pprint(reply)

'asn_registry': 'ripencc',
[...]

Figure 3: Ipwhois provides RDAP output for an IPv6 address via a sample script.
Figure 4: WHOIS output for a domain in the graphical gnome-nettool front end.
Conclusions
The way the information for a domain or IP ad-
dress is displayed is ultimately disappointing
due to the structure of the response. We are not
aware of a graphical front end capable of pro-
viding substantially better output than the text
format.
Gnome-nettool [18] (see Figure 4 for the heise.
de domain) calls the WHOIS tool in the back-
ground and drops its output into a text window
without color highlighting or structuring. The tool
does not support RDAP right now.
The Who’s Who web service cuts a far better
figure. The address is spelled as the original domain name, supplemented in each case by .whoswho (i.e., “Who’s Who” without spaces and apostrophes). Figure 5 shows a section of the information accessible for the linux-magazine.com
domain.
As the source code snippets from this article
make clear, queries for information on a domain
or IP address can be implemented with compar-
atively little effort. The required program code is
manageable, especially thanks to libraries that
take care of a major portion of the communication task involved.
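How little code the evaluation step needs can be shown without any library at all, because an RDAP reply is ordinary JSON. The sketch below works on a plain dictionary. The keys ldhName and nameservers appear in the listing earlier in this article; the events list with an eventAction of expiration is taken from the RDAP response format and is not a reconstruction of the listing's missing lines. The sample record is hand-written and its values are made up.

```python
def summarize_domain(data: dict) -> dict:
    """Pull a few fields out of an RDAP domain response."""
    # The expiry date lives in the "events" list under eventAction "expiration".
    expires = next(
        (event.get("eventDate") for event in data.get("events", [])
         if event.get("eventAction") == "expiration"),
        None,
    )
    nameservers = [
        ns["ldhName"].lower()
        for ns in data.get("nameservers", [])
        if "ldhName" in ns
    ]
    return {
        "name": data.get("ldhName", "").lower(),
        "expires": expires,
        "nameservers": nameservers,
    }


# Hand-written sample in the shape of an RDAP reply; the values are made up.
SAMPLE = {
    "ldhName": "EXAMPLE.COM",
    "events": [
        {"eventAction": "registration", "eventDate": "2001-01-01T00:00:00Z"},
        {"eventAction": "expiration", "eventDate": "2030-01-01T00:00:00Z"},
    ],
    "nameservers": [{"ldhName": "NS1.EXAMPLE.NET"}, {"ldhName": "NS2.EXAMPLE.NET"}],
}

summary = summarize_domain(SAMPLE)
print(summary)
```

Using .get() with defaults keeps the function from failing when a registry omits a field, which happens often enough in practice to be worth the extra characters.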
Scan Doctor
Use this handy tool to make your scanned PDF files zoomable and searchable.
BY MARCO FIORETTI
The Portable Document Format (PDF) is surely one of the very few file formats that every user of digital devices has used at least once, knowingly or not. In spite of its popularity, however (or maybe because of it?), the PDF is also too often misused. This article describes the problem and how you can solve it with a tool called pdfsandwich [1].
Purpose and Limits of PDF
In general, and simplifying a lot, the purpose of PDF is to make sure that the visible and printable versions of a digital document always look the same on both screen and paper, no matter which hardware and software render it. The PDF version of a text file configured to print on A4 sheets, for example, will always keep that form factor, regardless of that of the screen that displays it.
For the same reason, the PDF version of a spreadsheet will surely show the numbers in each cell with whatever font, font face, and color its author chose. However, it will not show the formulas that generate those numbers and, just like its printed version, it will not allow you to, for example, reorder the rows in any way.
These limits, which were embedded in the design and mission of the PDF standard on day one, are well known and don’t make it less useful. The misuse mentioned earlier comes from ignoring, or forgetting, that scanners create images, but computers represent text and images in two very different ways, which I have summarized in the “How Computers Manage Text and Images” box.
A direct consequence of this difference is that PDF files directly generated by word processors, spreadsheets, and other software programs are very different from the PDF files that could be generated by printing those documents and then directly scanning the printouts.
PDF files of the first kind preserve the encoding of each character, letter or not, and this allows their content to be searched or processed automatically as native text, as shown in Figure 1.
[...] all you need to convert them is what I did with the file of Figure 3. You pass their names to pdfsandwich on the command line, without any options:

#> pdfsandwich trauma-of-materialism.pdf

The first time I did this on Ubuntu 22.04 LTS, however, I got a lot of errors like this:

identify-im6.q16: attempt to perform an operation not allowed by the security policy `PDF' @ error/constitute.c/IsCoderAuthorized/421.

A quick online search showed that I had run into the same misconfiguration issue already reported by many ImageMagick users [7]: The default configuration of that suite, defined in the file /etc/ImageMagick-6/policy.xml, does not include read or write access to PDF files.
This issue, which I had never noticed before, prevents pdfsandwich from generating from the original PDF the separate image files that Tesseract could load and convert to text. Luckily, the fix is easy.

Listing 1: Making pdfsandwich Work
ORIGINAL LINES:
<!-- <policy domain="module" rights="none" pattern="{PS,PDF,XPS}" /> -->
<policy domain="coder" rights="none" pattern="PDF" />
EDITED LINES:
<policy domain="module" rights="read|write" pattern="{PS,PDF,XPS}" />
<policy domain="coder" rights="read|write" pattern="PDF" />

After changing two lines of that file as shown in Listing 1 (remember to make a copy of the original first!), the errors went away, and pdfsandwich generated the file shown in Figure 4, called trauma-of-materialism_ocr.pdf.
As you can see, while the appearance is the same as in Figure 3, there is a really big difference, and it is just what we wanted. After using pdfsandwich, you can search for any word and all its occurrences will appear in the sidebar; you can also right-click to copy and paste all the text you want.
Please note that, depending on the quality of both the original text and of the scanning process, the text you select may require some adjustment before you can reuse it elsewhere. For example, if the text has spelling errors because the OCR process failed to correctly recognize some character, you will have to correct them yourself. If a word is hyphenated, it will remain split in two parts after the conversion but without the hyphen, and you will have to fix that yourself, too.
Also, when the text is on multiple columns, lines of different columns may be merged into one.
Even with these limitations, however, a PDF file processed by pdfsandwich is infinitely more usable than it was before. The price to pay is an increase in size, because by default the text is placed below the original images, instead of replacing them. As an example, pdfsandwich made the 16-page file (shown in Figure 3) 8.5 percent bigger, which, if you ask me, is an excellent deal considering how much you gain.
While I am sure that you will find pdfsandwich’s work quite satisfying in most cases, it is important to remember that it can’t do miracles. The quality of pdfsandwich’s results depends very much on the quality of the scanning process.
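Returning briefly to the ImageMagick security policy: before editing policy.xml by hand, it can help to see which entries actually block PDF handling. The following sketch is an illustration under assumptions, not part of pdfsandwich. It assumes the policymap layout used by ImageMagick 6, the function name is hypothetical, and the sample XML is a shortened stand-in for the real file.

```python
import xml.etree.ElementTree as ET

# Assumption: the policy file has the <policymap><policy .../></policymap>
# layout used by ImageMagick 6; the path is the one named in the article.
POLICY_FILE = "/etc/ImageMagick-6/policy.xml"

# Shortened stand-in for the real file.
SAMPLE = """<policymap>
  <policy domain="resource" name="memory" value="256MiB"/>
  <policy domain="coder" rights="none" pattern="PDF" />
  <policy domain="coder" rights="read|write" pattern="PNG" />
</policymap>"""


def pdf_blockers(xml_text: str) -> list[dict]:
    """Return policy entries that mention PDF without read and write rights."""
    blockers = []
    for policy in ET.fromstring(xml_text).iter("policy"):
        # Patterns may be a single name or a brace list such as {PS,PDF,XPS}.
        patterns = policy.get("pattern", "").strip("{}").split(",")
        rights = set(policy.get("rights", "").lower().split("|"))
        if "PDF" in patterns and not {"read", "write"} <= rights:
            blockers.append(dict(policy.attrib))
    return blockers


for entry in pdf_blockers(SAMPLE):
    print("blocks PDF:", entry)
```

To check a real system, read the file first, for example with pathlib's Path(POLICY_FILE).read_text(), and pass the text to pdf_blockers(). Entries that are commented out in the XML are ignored by the parser, which matches how ImageMagick itself treats them.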
As a real-world proof of how important it is to
scan well, imagine a very common case in many
households: scanning some old, maybe pretty
thick, magazines before tossing them away, be-
cause you absolutely need to free some shelf
space but want to keep their content, and maybe
need to reuse the text.
Figure 5 contains a side-by-side comparison of
a scan of my recent Linux Magazine datamash ar-
ticle [8] on the left, and the pdfsandwich transfor-
mation of the scan on the right. Taken together,
those two files are a great example of how things
may go wrong (or worse than expected) if you
combine two very common errors everybody
makes at least once (myself included!) when
scanning magazines.
The original scan, which was saved as the PDF file on the left, was made with the default settings of the Simple Scan tool for scanning text. As expected, in that file it’s not possible to select text, because its content contains no actual text, just images of whole pages. This is why trying to select text yields the little page icon (1 in Figure 5) instead of highlighting any word.
The version of the same file generated by pdfsandwich, on the right, does contain actual text that can be selected with the mouse (2 in Figure 5) but is somewhat disappointing, isn’t it? Half the text of the rightmost column is missing!
The reasons for these suboptimal results are just the two errors I already mentioned: First, I chose the scan for text option of Simple Scan, which produces smaller files, but with reduced quality, which is less suitable for OCR programs. Then, I did not align the paper page properly to the edge of the scanner plate and didn’t press on all of its sides the same way. The rightmost column was unaligned and not touching the plate because of these mistakes. As a result, the corresponding part of the image was of too poor quality for unpaper and Tesseract to recognize much of it.
Things go much better if you take care of aligning the page edges to those of the scanner – make sure all of the page lies flat on the plate and then set Simple Scan to scan as image, with colors and all. As shown in Figure 6, the resulting image PDF file is of much better quality, which in turn makes the job of pdfsandwich much easier, as you can see in Figure 7.
It’s still not possible to select text from only one column, and if you look carefully at the rightmost column, you will notice that it was not possible to select the “G” drop cap, because evidently it wasn’t recognized as a letter. But it’s a great improvement over Figure 6 all the same.
In practice, there are so many combinations of scanners, scanning software, quality levels of the original paper document, and text layout options that it is impossible to give one set of tips that would work in every case. However, this is not a limitation of pdfsandwich or a big deal in any way. By this, I mean that on one hand, if you want to convert image PDFs that you didn’t scan yourself (where all the damage was already done by someone else), you must just live with it. On the other hand, if you are scanning the documents you want to pass to pdfsandwich, you must just be [...]
Figure 4: The same file in Figure 3, after being processed by pdfsandwich, becomes fully searchable!
Figure 5: A page of Linux Magazine scanned and saved as PDF has no selectable text (1) until it’s passed to pdfsandwich (2).
Figure 6: The same page seen in Figure 5 has much better results when scanned as an image, not text, and properly aligned to the scanner.
After 88 issues, and over 1,000 individual picks, this is Graham’s last
FOSSPicks after finally deciding to pass the baton to the next
generation of FOSS-picking practitioners. BY GRAHAM MORRISON
Polyphonic sound generation
Firefly Synth
his wouldn’t be FOSSPicks interface, making it all look more oscillators. As with all the duplicate elements in the synth
envio
nvironment variables are even store your environment
Synth bass
JC303
Even if you don’t care about synthesizers, there’s one that you can’t escape, and that’s the acid-like squelch of a Roland TB-303 bass-line synth. This synth was packaged in a small silver box and originally released in the very early 1980s. Roland built the TB-303 to replace human bass players, but just like its CR-78 “CompuRhythm” drum machine and drummers, it was instead instrumental in creating an entirely new genre of electronic music, leaving salaried musicians safely alone. The TB-303 in particular became the definitive sound of acid house music in the late ’80s, where its monophonic and resonant sound energy matched perfectly with Roland’s other famous drum machine, the TR-808, creating a dance music craze that modern music is still driven by.
With prices for the original TB-303 remaining high, and even modern recreations missing the nuances of the original sound, software developers have long attempted to recreate the original sound in code, and an open source project called Open303 got about as close as you could get. JC303 is a new port of that original tried and tested Open303 code to the JUCE framework, bringing with it a modernized GUI and much better software integration capabilities. Many open source plugins now use JUCE because it offers modern MIDI compatibility with audio software integration, but also remote control, multiplatform and multiarchitecture support, and a fantastic canvas for modern interface designs with knob and button interaction. For JC303, it brings support for LV2 and CLAP, for instance, alongside VST2 and VST3, and there are plans to add both preset and a recreation of the original step sequencer. The sequencer was an integral part of the original sound, and JUCE will make its recreation a lot easier than having to code something locked to a GUI entirely from scratch.
JC303, via Open303, recreates all the analog nuances of the original TB-303, including its quirky design and form factor.
Project Website
https://github.com/midilab/jc303
Music platform been through a cou- generate the site, you run
the faircamp executable
Faircamp
ple of tough acquisi-
tions, resulting in un- within a strictly structured
certainty and mass music library, and the bi-
layoffs, and this in nary renders audio files to
e don’t often look at turn has motivated various formats and bi-
Gnome Web
menus in the title bar
and keeping onscreen
decoration to an ab-
ozilla has been making other alternatives solute minimum. It’s
Scrutiny
tial if your local storage is hidden
away in some loft space and sel-
dom thought about.
ow that we’ve mostly Scrutiny provides a beautiful,
Video editor
Flowblade
considerable number of sections out from any place,
Editing with Flowblade is a great experience. After so long in development, it’s also a very stable one with predictable output and processing time.
Project Website
https://jliljebl.github.io/flowblade/
The
from small menus. While it
doesn’t have the 3D graphics of
Elite or Frontier, it does have sim-
Masters
local systems according to your
ship’s range before scanning
local systems for minerals or
f you’re still looking for more meeting the neighborhood’s local
BY DANIEL TIBI
Barcodes and QR codes abound in modern life. You can find them on products at the supermarket, train and plane tickets, concert and movie tickets, and letters and parcels. With a barcode or QR code, you can easily pass on short text, contact data, URLs, or WiFi access credentials, which can then be quickly scanned and processed with the right choice of app on a smartphone or tablet. Zint Barcode Generator [1] lets you easily make your own barcodes or QR codes.
Installation
To get started, you’ll want to grab the current version, Zint v2.13, from December 18, 2023. You can build the software yourself from the source code [2], which you’ll find along with some instructions on SourceForge. Alternatively, you can use the prebuilt Snap package [3]. You may find Zint in some distributions’ repositories, but it is an older version.
After completing the install, you can launch the program’s graphical user interface (GUI) by pressing the button or opening a command line and typing zint-qt or zint-snap (if you installed the Snap package).
To run Zint in the shell, just type the zint command. If you prefer typing to clicking, you can type zint --help to display a list of commands. I recommend taking a quick look at the online Zint manual [4] to familiarize yourself with the [...]
Figure 1: The Zint GUI shows you a preview of the code (top) with tabs below for entering data.
Figure 2: QR codes make it easy to open links without having to type in the URL.
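If you want to create codes from a script rather than the GUI, the zint command can be wrapped in a few lines. The sketch below is only an illustration: the function names are hypothetical, it assumes a zint binary on your PATH, and it uses the command-line options -b (symbology), -o (output file), and -d (data) together with 58 as the symbology number for QR Code. Nothing is executed unless you call make_code() yourself.

```python
import shutil
import subprocess

QRCODE = 58  # Zint's numeric symbology ID for QR Code


def zint_command(data: str, outfile: str, symbology: int = QRCODE) -> list[str]:
    """Assemble a zint call: -b picks the symbology, -o the file, -d the data."""
    return ["zint", "-b", str(symbology), "-o", outfile, "-d", data]


def make_code(data: str, outfile: str) -> bool:
    """Run zint if it is installed; return True when the command was executed."""
    if shutil.which("zint") is None:
        print("zint not found; would run:", " ".join(zint_command(data, outfile)))
        return False
    subprocess.run(zint_command(data, outfile), check=True)
    return True


# Example call:
# make_code("https://www.linux-magazine.com", "qr.png")
```

Passing the arguments as a list instead of a single shell string avoids quoting problems when the data contains spaces or special characters.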
Data Fields
Underneath the code preview, you’ll find three
tabs for entering data to customize your bar-
code or QR code. In the first tab, Data, you enter
the data that you want to display as code in the
Data to encode field. Depending on the desired
Figure 7: Zint can export your code to most popular graphic formats. You can either save the results or copy them to the clipboard.
Info
[1] Zint Barcode Generator: https://zint.org.uk
[2] Zint on SourceForge: https://sourceforge.net/projects/zint/files/
[3] Snap package: https://snapcraft.io/zint-snap
[4] Online manual: https://zint.org.uk/manual/
SERVICE
Back Issues
LINUX
NEWSSTAND
Order online:
https://bit.ly/Linux-Magazine-catalog
Linux Magazine is your guide to the world of Linux. Monthly issues are packed with advanced technical
articles and tutorials you won't find anywhere else. Explore our full catalog of back issues for specific
topics or to complete your collection.
#281/April 2024
Virtual Memory
The classic vision of random access memory is just the beginning of the story. Modern hardware –
and modern operating systems – manage memory in ways that old-school programmers could
only have imagined. This month we take a look at virtual memory in Linux.
On the DVD: elementary OS 7.1 and Mageia 9
#280/March 2024
Plasma 6
KDE’s classic Plasma desktop can be as simple as you need it to be or as complicated as you
want to make it. This month we explore the powerful Plasma 6 release that is making its way
to your Linux distribution.
On the DVD: Linux Mint 21.3 MATE and Zorin OS 17 Core
#279/February 2024
Intrusion Detection
You don’t need a fancy appliance to watch for intruders – just Suricata and a Raspberry Pi.
On the DVD: EndeavourOS Galileo 11 and Arch Linux 2023.12.01
#278/January 2024
Scientific Computing
A crypto mining rig is built for math. Can an old rig find a second life solving science problems?
That all depends on the problem. Also this month, we explore a few popular data analysis
techniques and stir up some analysis of our own with the R programming language.
On the DVD: Kubuntu 23.10 and Fedora 39
#277/December 2023
Low-Code Tools
Experienced programmers are hard to find. Wouldn’t it be nice if subject matter experts and
occasional coders could create their own applications? The low-code revolution is all about
lowering the bar for programming knowledge. This month we show you some tools that let
you assemble an application using easy graphical building blocks.
On the DVD: MX Linux MX-23_x64 and Kali Linux 2023.3
#276/November 2023
ChatGPT on Linux
Everybody’s talking about ChatGPT, and ChatGPT is talking about everything. Sure you can
access the glib and versatile AI chatbot from a web interface, but think of the possibilities if
you tune in from the Linux command line.
On the DVD: Rocky Linux 9.2 and Debian 12.1
FEATURED EVENTS
Users, developers, and vendors meet at Linux events around the world.
We at Linux Magazine are proud to sponsor the Featured Events shown here.
For other events near you, check our extensive events calendar online at
https://www.linux-magazine.com/events.
If you know of another Linux event you would like us to add to our calendar,
please send a message with all the details to [email protected].
Events
DrupalCon Portland 2024 May 6-9 Portland, Oregon https://events.drupal.org/portland2024
ISC 2024 May 12-16 Hamburg, Germany https://www.isc-hpc.com/
DORS/CLUC May 15-19 Zagreb, Croatia https://www.dorscluc.org/
PyCon US 2024 May 15-23 Pittsburgh, Pennsylvania https://us.pycon.org/2024/
Icinga Summit 2024 June 5-6 Berlin, Germany https://icinga.com/summit/
CloudFest USA June 5-8 Austin, Texas https://www.cloudfest.com/usa/
stackconf June 18-19 Berlin, Germany https://stackconf.eu/
Open South Code June 21-22 Málaga, Spain https://www.opensouthcode.org/conferences/opensouthcode2024
Design Automation Conference June 23-27 San Francisco, California https://www.dac.com/
useR! July 8-11 Salzburg, Austria and Virtual https://events.linuxfoundation.org/user/
GUADEC 2024 July 19-24 Denver, Colorado https://foundation.gnome.org/2023/12/20/guadec-2024-in-denver-colorado/
Akademy 2024 Sep 7-12 Würzburg, Germany + Online https://akademy.kde.org/2024/
Open Source Summit Europe Sep 16-18 Vienna, Austria https://events.linuxfoundation.org
Images © Alex White, 123RF.com
Contact Info
Editor in Chief
Joe Casad, [email protected]
Copy Editors
Amy Pettle, Aubrey Vaughn
News Editors
Jack Wallen, Amber Ankerholz
Editor Emerita Nomadica
Rita L Sooby
Managing Editor
Lori White
Localization & Translation
Ian Travis
Layout
Lori White
Cover Design
Dena Friesen
Cover Image
© smon, 123RF.com
Advertising
Brian Osborn, [email protected]
phone +49 8093 7679420
Marketing Communications
Gwen Clark, [email protected]
Linux New Media USA, LLC
4840 Bob Billings Parkway, Ste 104
Lawrence, KS 66049 USA
Publisher
Brian Osborn
Customer Service / Subscription
For USA and Canada:
Email: [email protected]
Phone: 1-866-247-2802
(Toll Free from the US and Canada)
For all other countries:
Email: [email protected]
www.linux-magazine.com

While every care has been taken in the content of the magazine, the publishers cannot be held responsible for the accuracy of the information contained within it or any consequences arising from the use of it. The use of the disc provided with the magazine or any material provided on it is at your own risk.

Copyright and Trademarks © 2024 Linux New Media USA, LLC.

No material may be reproduced in any form whatsoever in whole or in part without the written permission of the publishers. It is assumed that all correspondence sent, for example, letters, email, faxes, photographs, articles, drawings, are supplied for publication or license to third parties on a non-exclusive worldwide basis by Linux New Media USA, LLC, unless otherwise stated in writing.

Linux is a trademark of Linus Torvalds.

All brand or product names are trademarks of their respective owners. Contact us if we haven’t credited your copyright; we will always correct any oversight.

Printed in Nuremberg, Germany by Kolibri Druck.

Distributed by Seymour Distribution Ltd, United Kingdom

Represented in Europe and other territories by: Sparkhaus Media GmbH, Bialasstr. 1a, 85625 Glonn, Germany.

Linux Magazine (Print ISSN: 1471-5678, Online ISSN: 2833-3950, USPS No: 347-942) is published monthly by Linux New Media USA, LLC, and distributed in the USA by Asendia USA, 701 Ashland Ave, Folcroft PA. Application to Mail at Periodicals Postage Prices is pending at Philadelphia, PA and additional mailing offices. POSTMASTER: send address changes to Linux Magazine, 4840 Bob Billings Parkway, Ste 104, Lawrence, KS 66049, USA.

WRITE FOR US
Linux Magazine is looking for authors to write articles on Linux and the tools of the Linux environment. We like articles on useful solutions that solve practical problems. The topic could be a desktop tool, a command-line utility, a network monitoring application, a homegrown script, or anything else with the potential to save a Linux user trouble and time. Our goal is to tell our readers stories they haven’t already heard, so we’re especially interested in original fixes and hacks, new tools, and useful applications that our readers might not know about. We also love articles on advanced uses for tools our readers do know about – stories that take a traditional application and put it to work in a novel or creative way.

We are currently seeking articles on the following topics for upcoming cover themes:
• Open hardware
• Linux boot tricks
• Best browser extensions

Let us know if you have ideas for articles on these themes, but keep in mind that our interests extend through the full range of Linux technical topics, including:
• Security
• Advanced Linux tuning and configuration
• Internet of Things
• Networking
• Scripting
• Artificial intelligence
• Open protocols and open standards

If you have a worthy topic that isn’t on this list, try us out – we might be interested!

Please don’t send us articles about products made by a company you work for, unless it is an open source tool that is freely available to everyone. Don’t send us webzine-style “Top 10 Tips” articles or other superficial treatments that leave all the work to the reader. We like complete solutions, with examples and lots of details. Go deep, not wide.

Describe your idea in 1-2 paragraphs and send it to: [email protected]. Please indicate in the subject line that your message is an article proposal.

Authors
Erik Bärwaldt 64
Chris Binnie 32
Zack Brown 12
Bruce Byfield 6, 44
Joe Casad 3
Mark Crutch 69
Marco Fioretti 38, 78
William F. Gilreath 48
Jon “maddog” Hall 71
Frank Hofmann 72
Vincent Mealing 69
Pete Metcalfe 60
Graham Morrison 84
Amy Pettle 26
Mike Schilli 52
Daniel Tibi 90
Jack Wallen 8
Michael Williams 16
Artificial Intelligence
It is no secret that AI is finding its way
into everyday life. Next month we show
you some of the ways it is finding its
way into the Linux environment.
Preview Newsletter
The Linux Magazine Preview is a monthly email
newsletter that gives you a sneak peek at the next
issue, including links to articles posted online.
Sign up at: https://bit.ly/Linux-Update