2019-05-28 Weekly Study Notes
Continued from last week:
- In Bash scripts, subshells (written with parentheses) are convenient ways to group commands. A common example is to temporarily move to a different working directory, e.g.
# do something in current dir
(cd /some/other/dir && other-command)
# continue in original dir
- In Bash, note there are lots of kinds of variable expansion. Checking that a variable exists: ${name:?error message}. For example, if a Bash script requires a single argument, just write input_file=${1:?usage: $0 input_file}. Using a default value if a variable is empty: ${name:-default}. If you want to have an additional (optional) parameter added to the previous example, you can use something like output_file=${2:-logfile}. If $2 is omitted and thus empty, output_file will be set to logfile. Arithmetic expansion: i=$(( (i+1) % 5 )). Sequences: {1..10}. Trimming of strings: ${var%suffix} and ${var#prefix}. For example, if var=foo.pdf, then echo ${var%.pdf}.txt prints foo.txt.
- Brace expansion using {...} can reduce having to re-type similar text and automate combinations of items. This is helpful in examples like mv foo.{txt,pdf} some-dir (which moves both files), cp somefile{,.bak} (which expands to cp somefile somefile.bak) or mkdir -p test-{a,b,c}/subtest-{1,2,3} (which expands all possible combinations and creates a directory tree). Brace expansion is performed before any other expansion.
- The order of expansions is: brace expansion; tilde expansion, parameter and variable expansion, arithmetic expansion, and command substitution (done in a left-to-right fashion); word splitting; and filename expansion. (For example, a range like {1..20} cannot be expressed with variables using {$a..$b}. Use seq or a for loop instead, e.g., seq $a $b or for((i=a; i<=b; i++)); do ...; done.)
- The output of a command can be treated like a file via <(some command) (known as process substitution). For example, compare local /etc/hosts with a remote one:
diff /etc/hosts <(ssh somehost cat /etc/hosts)
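The expansions from the last few items can be tried together in one short script. This is a minimal sketch assuming Bash; the file names and values are made-up placeholders, and set -- is used only to simulate a script argument.

```shell
#!/usr/bin/env bash
# Sketch only: every name and value below is an illustrative placeholder.

set -- input.pdf                     # simulate calling the script with one argument
input_file=${1:?usage: $0 input_file}
output_file=${2:-logfile}            # $2 is missing, so the default is used

# Suffix and prefix trimming.
text_name=${input_file%.pdf}.txt
no_prefix=${input_file#in}

# Arithmetic expansion.
i=4
i=$(( (i + 1) % 5 ))

# Brace expansion runs before variable expansion, so {$a..$b} is not a range.
a=1 b=3
literal=$(echo {$a..$b})
range=$(seq "$a" "$b" | paste -sd' ' -)

# Process substitution: each command's output is presented as a file name.
common=$(comm -12 <(printf 'a\nb\n') <(printf 'b\nc\n'))

printf '%s\n' "$output_file" "$text_name" "$no_prefix" "$i" "$literal" "$range" "$common"
```

The literal variable ends up holding the unexpanded text {1..3}, which is why seq (or a for loop) is needed for variable ranges.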
- When writing scripts you may want to put all of your code in curly braces. If the closing brace is missing, your script will be prevented from executing due to a syntax error. This makes sense when your script is going to be downloaded from the web, since it prevents partially downloaded scripts from executing:
{
# Your code here
}
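One rough way to see this protection at work is to feed Bash a complete and a truncated copy of the same brace-wrapped snippet. The snippet contents are hypothetical, and the exact non-zero exit status is not assumed here.

```shell
#!/usr/bin/env bash
# A complete brace group, and the same text cut off before the closing brace
# (as a partial download might be). Both snippets are made up for illustration.
complete=$'{\n  echo "step 1"\n  echo "step 2"\n}\n'
truncated=$'{\n  echo "step 1"\n  echo "step 2"\n'

full_output=$(bash -c "$complete")

# Bash must parse the whole group before running it, so nothing is executed.
partial_output=$(bash -c "$truncated" 2>/dev/null)
partial_status=$?

printf 'complete: %s\n' "$full_output"
printf 'truncated: output=[%s] status=%s\n' "$partial_output" "$partial_status"
```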
- A "here document" allows redirection of multiple lines of input as if from a file:
cat <<EOF
input
on multiple lines
EOF
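Whether the body of a here document is expanded depends on the delimiter: quoting it passes the text through literally. A small sketch, with a made-up variable name:

```shell
#!/usr/bin/env bash
name="world"   # illustrative variable

# Unquoted delimiter: variables and command substitutions are expanded.
expanded=$(cat <<EOF
hello $name
EOF
)

# Quoted delimiter: the body is kept exactly as written.
literal=$(cat <<'EOF'
hello $name
EOF
)

printf '%s\n' "$expanded" "$literal"
```

The quoted form is handy when the text itself contains shell syntax, for example when generating another script.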
- In Bash, redirect both standard output and standard error via: some-command >logfile 2>&1 or some-command &>logfile. Often, to ensure a command does not leave an open file handle to standard input, tying it to the terminal you are in, it is also good practice to add </dev/null.
- Use man ascii for a good ASCII table, with hex and decimal values. For general encoding info, man unicode, man utf-8, and man latin1 are helpful.
- Use screen or [tmux](https://tmux.github.io) to multiplex the screen, especially useful on remote ssh sessions and to detach and re-attach to a session. byobu can enhance screen or tmux by providing more information and easier management. A more minimal alternative for session persistence only is [dtach](https://github.com/bogner/dtach).
- In ssh, knowing how to port tunnel with -L or -D (and occasionally -R) is useful, e.g. to access web sites from a remote server.
- It can be useful to make a few optimizations to your ssh configuration; for example, this ~/.ssh/config contains settings to avoid dropped connections in certain network environments, uses compression (which is helpful with scp over low-bandwidth connections), and multiplexes channels to the same server with a local control file:
TCPKeepAlive=yes
ServerAliveInterval=15
ServerAliveCountMax=6
Compression=yes
ControlMaster auto
ControlPath /tmp/%r@%h:%p
ControlPersist yes
- A few other options relevant to ssh are security sensitive and should be enabled with care, e.g. per subnet or host or in trusted networks: StrictHostKeyChecking=no, ForwardAgent=yes
- Consider [mosh](https://mosh.mit.edu) an alternative to ssh that uses UDP, avoiding dropped connections and adding convenience on the road (requires server-side setup).
- To get the permissions on a file in octal form, which is useful for system configuration but not available in ls and easy to bungle, use something like
stat -c '%A %a %n' /etc/timezone
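To check the format specifiers on a file you control, the following sketch sets a known mode on a scratch file and reads it back. It assumes GNU coreutils stat (the -c option); BSD and macOS stat use a different syntax.

```shell
#!/usr/bin/env bash
# Assumes GNU stat. The scratch file exists only for this demonstration.
scratch=$(mktemp)
chmod 640 "$scratch"

octal_mode=$(stat -c '%a' "$scratch")      # numeric form, as used by chmod
symbolic_mode=$(stat -c '%A' "$scratch")   # the form ls -l shows

echo "$symbolic_mode $octal_mode"
rm -f "$scratch"
```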
- For interactive selection of values from the output of another command, use [percol](https://github.com/mooz/percol) or [fzf](https://github.com/junegunn/fzf).
- For interaction with files based on the output of another command (like git), use fpp (PathPicker).
- For a simple web server for all files in the current directory (and subdirs), available to anyone on your network, use: python -m SimpleHTTPServer 7777 (for port 7777 and Python 2) and python -m http.server 7777 (for port 7777 and Python 3).
-
For running a command as another user, use sudo. Defaults to running as root; use -u to specify another user. Use -i to login as that user (you will be asked for your password).
- For switching the shell to another user, use su username or su - username. The latter with "-" gets an environment as if another user just logged in. Omitting the username defaults to root. You will be asked for the password of the user you are switching to.
- Know about the limit on command-line length (traditionally 128K; getconf ARG_MAX reports the current value). The "Argument list too long" error is common when wildcard matching large numbers of files. (When this happens alternatives like find and xargs may help.)
- For a basic calculator (and of course access to Python in general), use the python interpreter. For example,
>>> 2+3
5
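For the argument-length limit noted a few items above, one common workaround is to let find produce the file names and xargs pass them on in batches. A minimal sketch on a scratch directory with made-up file names:

```shell
#!/usr/bin/env bash
# Scratch directory and file names are placeholders for illustration.
workdir=$(mktemp -d)
for n in 1 2 3; do : > "$workdir/sample-$n.log"; done

# The limit in bytes as reported by the system; the value varies by platform.
arg_max=$(getconf ARG_MAX)

# Instead of a wildcard on the rm command line, find emits the names and
# xargs batches them. -print0 and -0 keep unusual file names intact.
removed=$(find "$workdir" -name '*.log' -print0 | xargs -0 rm -v | wc -l)
remaining=$(find "$workdir" -name '*.log' | wc -l)

echo "ARG_MAX=$arg_max removed=$removed remaining=$remaining"
rmdir "$workdir"
```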
Processing files and data
- To locate a file by name in the current directory, find . -iname '*something*' (or similar). To find a file anywhere by name, use locate something (but bear in mind updatedb may not have indexed recently created files).
- For general searching through source or data files, there are several options more advanced or faster than grep -r, including (in rough order from older to newer) [ack](https://github.com/beyondgrep/ack2), [ag](https://github.com/ggreer/the_silver_searcher) ("the silver searcher"), and [rg](https://github.com/BurntSushi/ripgrep) (ripgrep).
- To convert HTML to text: lynx -dump -stdin
- For Markdown, HTML, and all kinds of document conversion, try [pandoc](http://pandoc.org/). For example, to convert a Markdown document to Word format: pandoc README.md --from markdown --to docx -o temp.docx
- If you must handle XML, xmlstarlet is old but good.
- For JSON, use [jq](http://stedolan.github.io/jq/). For interactive use, also see [jid](https://github.com/simeji/jid) and [jiq](https://github.com/fiatjaf/jiq).
- For YAML, use [shyaml](https://github.com/0k/shyaml).
- For Excel or CSV files, [csvkit](https://github.com/onyxfish/csvkit) provides in2csv, csvcut, csvjoin, csvgrep, etc.
-
For Amazon S3, [s3cmd](https://github.com/s3tools/s3cmd) is convenient and [s4cmd](https://github.com/bloomreach/s4cmd) is faster. Amazon's [aws](https://github.com/aws/aws-cli) and the improved [saws](https://github.com/donnemartin/saws) are essential for other AWS-related tasks.
- Know about sort and uniq, including uniq's -u and -d options (see one-liners below). See also comm.
- Know about cut, paste, and join to manipulate text files. Many people use cut but forget about join.
- Know about wc to count newlines (-l), characters (-m), words (-w) and bytes (-c).
- Know about tee to copy from stdin to a file and also to stdout, as in ls -al | tee file.txt.
- For more complex calculations, including grouping, reversing fields, and statistical calculations, consider [datamash](https://www.gnu.org/software/datamash/).
-
Know that locale affects a lot of command line tools in subtle ways, including sorting order (collation) and performance. Most Linux installations will set LANG or other locale variables to a local setting like US English. But be aware sorting will change if you change locale. And know i18n routines can make sort or other commands run many times slower. In some situations (such as the set operations or uniqueness operations below) you can safely ignore slow i18n routines entirely and use traditional byte-based sort order, using export LC_ALL=C.
- You can set a specific command's environment by prefixing its invocation with the environment variable settings, as in TZ=Pacific/Fiji date.
- Know basic awk and sed for simple data munging. See [One-liners](#one-liners) for examples.
- To replace all occurrences of a string in place, in one or more files:
perl -pi.bak -e 's/old-string/new-string/g' my-files-*.txt
- To rename multiple files and/or search and replace within files, try [repren](https://github.com/jlevy/repren). (In some cases the rename command also allows multiple renames, but be careful as its functionality is not the same on all Linux distributions.)
# Full rename of filenames, directories, and contents foo -> bar:
repren --full --preserve-case --from foo --to bar .
# Recover backup files whatever.bak -> whatever:
repren --renames --from '(.*)\.bak' --to '\1' *.bak
# Same as above, using rename, if available:
rename 's/\.bak$//' *.bak
- As the man page says, rsync really is a fast and extraordinarily versatile file copying tool. It's known for synchronizing between machines but is equally useful locally. When security restrictions allow, using rsync instead of scp allows recovery of a transfer without restarting from scratch. It also is among the fastest ways to delete large numbers of files:
mkdir empty && rsync -r --delete empty/ some-dir && rmdir some-dir
- For monitoring progress when processing files, use [pv](http://www.ivarch.com/programs/pv.shtml), [pycp](https://github.com/dmerejkowsky/pycp), [pmonitor](https://github.com/dspinellis/pmonitor), [progress](https://github.com/Xfennec/progress), rsync --progress, or, for block-level copying, dd status=progress.
- Use shuf to shuffle or select random lines from a file.
- Know sort's options. For numbers, use -n, or -h for handling human-readable numbers (e.g. from du -h). Know how keys work (-t and -k). In particular, watch out that you need to write -k1,1 to sort by only the first field; -k1 means sort according to the whole line. Stable sort (sort -s) can be useful. For example, to sort first by field 2, then secondarily by field 1, you can use sort -k1,1 | sort -s -k2,2.
- If you ever need to write a tab literal in a command line in Bash (e.g. for the -t argument to sort), press ctrl-v [Tab] or write $'\t' (the latter is better as you can copy/paste it).
-
The standard tools for patching source code are diff and patch. See also diffstat for summary statistics of a diff and sdiff for a side-by-side diff. Note diff -r works for entire directories. Use diff -r tree1 tree2 | diffstat for a summary of changes. Use vimdiff to compare and edit files.
- For binary files, use hd, hexdump or xxd for simple hex dumps and bvi, hexedit or biew for binary editing.
- Also for binary files, strings (plus grep, etc.) lets you find bits of text.
- For binary diffs (delta compression), use xdelta3.
- To convert text encodings, try iconv. Or uconv for more advanced use; it supports some advanced Unicode things. For example:
# Displays hex codes or actual names of characters (useful for debugging):
uconv -f utf-8 -t utf-8 -x '::Any-Hex;' < input.txt
uconv -f utf-8 -t utf-8 -x '::Any-Name;' < input.txt
# Lowercase and removes all accents (by expanding and dropping them):
uconv -f utf-8 -t utf-8 -x '::Any-Lower; :: Any-NFD; [:Nonspacing Mark:]>; ::Any-NFC;' < input.txt > output.txt
- To split files into pieces, see split (to split by size) and csplit (to split by a pattern).
- Date and time: To get the current date and time in the helpful ISO 8601 format, use date -u +"%Y-%m-%dT%H:%M:%SZ" (other options are problematic). To manipulate date and time expressions, use dateadd, datediff, strptime etc. from dateutils.
- Use zless, zmore, zcat, and zgrep to operate on compressed files.
- File attributes are settable via chattr and offer a lower-level alternative to file permissions. For example, to protect against accidental file deletion, set the immutable flag: sudo chattr +i /critical/directory/or/file
- Use getfacl and setfacl to save and restore file permissions. For example:
getfacl -R /some/path > permissions.txt
setfacl --restore=permissions.txt
- To create empty files quickly, use truncate (creates sparse file), fallocate (ext4, xfs, btrfs and ocfs2 filesystems), xfs_mkfile (almost any filesystems, comes in xfsprogs package), mkfile (for Unix-like systems like Solaris, Mac OS).
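Several of the text tools in this section (sort keys, stable sort, the $'\t' tab literal, cut and uniq) can be tried on a small sample. The following is a sketch on made-up tab-separated data, with LC_ALL=C set so the ordering does not depend on locale.

```shell
#!/usr/bin/env bash
export LC_ALL=C   # byte-based collation, independent of the local setting

# Made-up sample: a name, a tab, then a group number.
data=$'pear\t2\napple\t2\nfig\t1\nbanana\t1'

# Sort by field 1 first, then stable-sort by field 2 so that rows with the
# same field 2 keep their field-1 order.
sorted=$(printf '%s\n' "$data" | sort -t $'\t' -k1,1 | sort -s -t $'\t' -n -k2,2)
printf '%s\n' "$sorted"

# Count rows per group with cut, sort and uniq -c; the awk step only
# normalizes the padding that uniq -c adds.
group_counts=$(printf '%s\n' "$data" | cut -f2 | sort | uniq -c | awk '{print $2 ":" $1}')
printf '%s\n' "$group_counts"
```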
System debugging
- For web debugging, curl and curl -I are handy, or their wget equivalents, or the more modern httpie.
- To know current cpu/disk status, the classic tools are top (or the better htop), iostat, and iotop. Use iostat -mxz 15 for basic CPU and detailed per-partition disk stats and performance insight.
- For network connection details, use netstat and ss.
- For a quick overview of what's happening on a system, dstat is especially useful. For broadest overview with details, use glances.
- To know memory status, run and understand the output of free and vmstat. In particular, be aware the "cached" value is memory held by the Linux kernel as file cache, so effectively counts toward the "free" value.
- Java system debugging is a different kettle of fish, but a simple trick on Oracle's and some other JVMs is that you can run kill -3 <pid> and a full stack trace and heap summary (including generational garbage collection details, which can be highly informative) will be dumped to stderr/logs. The JDK's jps, jstat, jstack, jmap are useful. SJK tools are more advanced.
-
Use mtr as a better traceroute, to identify network issues.
- For looking at why a disk is full, ncdu saves time over the usual commands like du -sh *.
- To find which socket or process is using bandwidth, try iftop or nethogs.
- The ab tool (comes with Apache) is helpful for quick-and-dirty checking of web server performance. For more complex load testing, try siege.
- For more serious network debugging, use wireshark, tshark, or ngrep.
- Know about strace and ltrace. These can be helpful if a program is failing, hanging, or crashing, and you don't know why, or if you want to get a general idea of performance. Note the profiling option (-c), and the ability to attach to a running process (-p). Use the trace child option (-f) to avoid missing important calls.
-
Know about ldd to check shared libraries etc., but never run it on untrusted files.
- Know how to connect to a running process with gdb and get its stack traces.
- Use /proc. It's amazingly helpful sometimes when debugging live problems. Examples: /proc/cpuinfo, /proc/meminfo, /proc/cmdline, /proc/xxx/cwd, /proc/xxx/exe, /proc/xxx/fd/, /proc/xxx/smaps (where xxx is the process id or pid).
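As a small illustration of the per-process entries, the sketch below inspects the current shell itself. It is Linux-only, and the variable names are placeholders.

```shell
#!/usr/bin/env bash
# Linux only: /proc is not available in this form on macOS or the BSDs.
pid=$$   # the current shell; any pid you are allowed to read works the same way

cwd=$(readlink "/proc/$pid/cwd")           # working directory of the process
exe=$(readlink "/proc/$pid/exe")           # path of the running binary
open_fds=$(ls "/proc/$pid/fd" | wc -l)     # number of open file descriptors
# cmdline is NUL-separated, so translate NULs to spaces for display.
cmdline=$(tr '\0' ' ' < "/proc/$pid/cmdline")

printf 'cwd=%s\nexe=%s\nfds=%s\ncmdline=%s\n' "$cwd" "$exe" "$open_fds" "$cmdline"
```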