grep awk sed
vmstat的输出如下,空格分割的。sed命令sed 's/^ *//;s/ */,/g' vmstat.dat > vmstat.02.csv
做了两件事:
1. 将行首的空格去掉
2. 将其他空格转成','符合csv的格式
多个sed命令用';'分隔,且共用/g命令
procs -----------memory---------- ---swap-- -----io---- -system-- -------cpu-------
r b swpd free buff cache si so bi bo in cs us sy id wa st gu
1 0 999208 3603968 353020 8258580 0 0 22 166 1016 2 4 1 94 0 0 0
处理后
procs,-----------memory----------,---swap--,-----io----,-system--,-------cpu-------
r,b,swpd,free,buff,cache,si,so,bi,bo,in,cs,us,sy,id,wa,st,gu
1,0,999208,3603968,353020,8258580,0,0,22,166,1016,2,4,1,94,0,0,0
我之前的awk通常用来打印某一列的内容,例如cat vmstat.dat | awk -F' ' '{print $1}'
打印$1第一列, -F设定分隔符。除此之外还能遍历查找。
例如查找'TARGET'
awk '{for(i=1;i<=NF;i++)if($i=="TARGET"){print "TARGET located at row: " NR " col: " i }}'
这里NR表示当前的行
, NF代表当前的列
。 NR还可以用来指定要处理的行号,例如vmstat中有两个header。例如
1. 只打印第二行的命令是awk '(NR==2){print $0}'
2. 要知道'id'在哪一列,则遍历第二行的每一列数据 awk '(NR==2){for(i=1;i<NF;i++) if($i=="id"){print i}'
会打印'15'
3. 要打印id这列的第一行的值, awk '(NR==2){for(i=1;i<NF;i++) if($i=="id"){getline; print $i}'
会打印'94'
要打印id这列的第N行的值时,就循环调用getline N次。
体验了 openpyxl, 接口设计非常自然, 功能也非常全, 包括合并格子、字体、颜色。典型用法如下
import openpyxl
workbook = openpyxl.load_workbook('output.xlsx')
print(workbook.sheetnames)
sheet = workbook['Sheet']
print(sheet.max_row) #总行数
print(sheet.max_column) #总列数
print(sheet.dimensions) #左上角:右下角
cell1 = sheet['A1']
print(cell1.row) #1
print(cell1.column) #A
print(cell1.coordinate) #A1
# 单元格的内容
print(cell1.value)
# 遍历
#先行后列
for row in sheet.iter_rows():
for c in row:
print(f"[{c.row}][{c.column}] = {c.value}")
#先列后行
for col in sheet.iter_cols():
for r in col:
print(f"[{r.row}][{r.column}] = {r.value}")
# 保存
workbook.save('test.xlsx')
openpyxl还能插入图片,感觉非常棒
from openpyxl import Workbook
from openpyxl.drawing.image import Image
wb = Workbook()
ws = wb.active
ws['A1'] = 'You should see three logos below'
# create an image
img = Image('logo.png')
# add to worksheet and anchor next to cells
ws.add_image(img, 'A1')
wb.save('logo.xlsx')
https://openpyxl.readthedocs.io/en/stable/images.html
Currently, xlwings can only run as a remote interpreter on Linux, see: https://docs.xlwings.org/en/stable/remote_interpreter.html
There are plans to add read/write capabilities, but for now you have to use one of OpenPyXL, XlsxWriter, or pyxlsb,
类比Excel使用vba来编程,Libreoffice sheet使用Basic,同时也支持 python
在chatgpt帮助下写了个url转图片的BASIC脚本
REM ***** BASIC *****
Option Explicit
Sub InsertAndResizeImages()
Dim oSheet As Object
DIM oCell as Object
Dim oGraphic As Object
Dim i As Long
Dim imageURL As String
Dim oSize As New com.sun.star.awt.Size
Dim oColumns as Object
Dim oRows As Object
oSheet = ThisComponent.CurrentController.getActiveSheet()
oColumns = oSheet.getColumns()
oColumns.getByIndex(0).Width = 3200
i = 160
Do
oCell = oSheet.getCellByPosition(0, i)
oRows = oSheet.getRows()
oRows.getByIndex(i).Height = 3200
imageURL = oCell.String
If imageURL = "" Then Exit Do
'Debug.Print oCell.String
'MsgBox imageURL
On Error GoTo NextImage
oGraphic = ThisComponent.createInstance("com.sun.star.drawing.GraphicObjectShape")
oGraphic.GraphicURL = imageURL
oGraphic.Position = oCell.Position
oSheet.DrawPage.add(oGraphic)
oSize.Width = 3200
oSize.Height = 3200
oGraphic.Size = oSize
Wait 500 ' Pauses for 0.5 seconds
Exit Do
NextImage:
i = i + 1
Loop
End Sub
libreoffice 的UNO API是独立于编程语言的,BASIC可以直接用 ThisComponent
,而python需要通过'XSCRIPTCONTEXT'来用,获取sheet的BOOK对象方法是doc = XSCRIPTCONTEXT.getDocument()
有个Basic和Python的对比写法,总结是很相似。用python重写BASIC代码就是
def insert_and_resize_images():
from com.sun.star.awt import Size
import time
doc = XSCRIPTCONTEXT.getDocument()
sheet = doc.CurrentController.ActiveSheet
columns = sheet.getColumns()
columns.getByIndex(0).Width = 3200
i = 160
while True:
cell = sheet.getCellByPosition(0, i)
rows = sheet.getRows()
rows.getByIndex(i).Height = 3200
image_url = cell.String
if image_url == "":
break
try:
graphic = doc.createInstance("com.sun.star.drawing.GraphicObjectShape")
graphic.GraphicURL = image_url
graphic.Position = cell.Position
sheet.DrawPage.add(graphic)
size = Size()
size.Width = 3200
size.Height = 3200
graphic.Size = size
time.sleep(0.5) # 0.5 seconds pause
except:
pass # Continue on error
i += 1
无论是excel的xlwings还是libreoffice sheet的pyUNO, 都不能独立使用,都要安装套件。